These forums have been archived and are now read-only.

The new forums are live and can be found at https://forums.eveonline.com/

EVE Technology Lab

 
  • Topic is locked indefinitely.
 

Public CREST market history - date limits please?

First post
Author
Zifrian
The Frog Pond
Ribbit.
#1 - 2015-05-08 11:25:18 UTC  |  Edited by: Zifrian
I know you've said no before, FF (because of caching on your end), but this would be really, really nice. Right now, to get say the last 7 days of information, we have to query roughly 395 days of data, which is really inefficient. Furthermore, if I download an entire year today, do I have to do the same thing tomorrow just to get one more day?

For one item this isn't much of an issue, but for my purposes I have around 2,000 items (and that's just blueprint items). I'm trying to limit the time it takes to update the database, and honestly, it's the DB inserts for data I don't need or want that are eating up my time, not CREST rate limits.

This data is really useful and awesome to have access to. While I've been waiting for replicators like EMD to implement it, that hasn't happened yet as far as I can tell. I'd rather not rely on other third-party sites and apps anyway, but the current system is a bit unwieldy.

Appreciate any feedback. I have a feeling I'm not alone.

Maximize your Industry Potential! - Download EVE Isk per Hour!

Import CCP's SDE - EVE SDE Database Builder

Polarina
EVE University
Ivy League
#2 - 2015-05-11 19:09:42 UTC
I am assuming that you store dates along with your historical data.

Would it not be possible for you to fetch the most recent date from your database, and then ignore all entries older than that when you loop over the result set in your application in order to insert the data into your database?
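The filter Polarina describes can be sketched in a few lines of Python. The table and field names here are illustrative, and the CREST `date` field is assumed to carry an ISO-format timestamp:

```python
from datetime import date

# Most recent history date already stored for this region/type pair
# (in practice this would come from something like
#  SELECT MAX(history_date) FROM market_history WHERE ...).
latest_stored = date(2015, 5, 10)

# Simulated CREST market-history response items.
crest_items = [
    {"date": "2015-05-09T00:00:00", "avgPrice": 10.0},
    {"date": "2015-05-10T00:00:00", "avgPrice": 10.5},
    {"date": "2015-05-11T00:00:00", "avgPrice": 11.0},
    {"date": "2015-05-12T00:00:00", "avgPrice": 11.2},
]

# Keep only rows newer than what the database already holds;
# everything else never reaches the INSERT at all.
new_rows = [item for item in crest_items
            if date.fromisoformat(item["date"][:10]) > latest_stored]

print([item["date"][:10] for item in new_rows])  # only 05-11 and 05-12 remain
```

You still download and loop over the full response, but the expensive part (the DB inserts) only sees the new rows.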

Where are my clothes?

Aineko Macx
#3 - 2015-05-12 09:43:54 UTC
CREST doesn't and won't allow specifying ranges for requested collections "because of caching".
So you'll need to implement your own logic to deal with redundant data.

Nevertheless, I'd love a "past week only" flag for market history.
Zifrian
The Frog Pond
Ribbit.
#4 - 2015-05-13 02:48:12 UTC
Polarina wrote:
I am assuming that you store dates along with your historical data.

Would it not be possible for you to fetch the most recent date from your database, and then ignore all entries older than that when you loop over the result set in your application in order to insert the data into your database?

Yeah, that speeds it up a bit, but I still have to loop through all those records I don't really need. And on days like today, when CREST is really slow, it doesn't seem to matter anyway.

It seems like this is great for someone who runs a replication service, but I'm not sure on-demand users will have much use for it outside of one query at a time.

Maximize your Industry Potential! - Download EVE Isk per Hour!

Import CCP's SDE - EVE SDE Database Builder

Nahasaki
Warpspeed Shipping Inc.
#5 - 2015-05-17 21:55:07 UTC  |  Edited by: Nahasaki
Polarina wrote:
I am assuming that you store dates along with your historical data.

Would it not be possible for you to fetch the most recent date from your database, and then ignore all entries older than that when you loop over the result set in your application in order to insert the data into your database?



Imagine the use case of daily updates, for some reason or another. Surely that's something many users will need.
Currently you specify a regionID & typeID and get the last year's worth of data for that ID tuple.

For anyone doing daily updates for one or two regions, that adds up to thousands of API calls while throwing away more than 99.5% of the results.
It's absolute insanity not to be able to fetch the latest changes for an entire region in one go.

Allowing arbitrary date limits would interfere too much with caching. But a "give me the last 3 days for this region and all typeIDs" endpoint would work well with caching and would most likely be a lot faster than the current way of doing it.
Selaria Unbertable
Bellator in Capsulam
#6 - 2015-07-06 14:58:58 UTC
*Bump*
Thought about opening a similar post, but since there's already one... I agree with Zifrian; a limit (maybe a separate URL or a parameter) would be really helpful.

Polarina wrote:
I am assuming that you store dates along with your historical data.

Would it not be possible for you to fetch the most recent date from your database, and then ignore all entries older than that when you loop over the result set in your application in order to insert the data into your database?


The problem isn't fetching the most recent data from the database, not even for the most recent day. The problem is updating the table in the database for each region/typeID combination. I don't even bother trying to filter anything; I just throw everything into the database with an "INSERT IGNORE..." query and let MySQL sort it out. But this is highly inefficient, since only a few thousand entries are added every day, while about a million are redundant.
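Selaria's approach can be sketched like this, with an in-memory SQLite table standing in for the MySQL one (the table layout is an assumption; SQLite's `INSERT OR IGNORE` plays the role of MySQL's `INSERT IGNORE`):

```python
import sqlite3

# Stand-in for the market_history table; the composite primary key is what
# makes duplicate rows get ignored rather than inserted twice.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE market_history (
    region_id INTEGER, type_id INTEGER, history_date TEXT, avg_price REAL,
    PRIMARY KEY (region_id, type_id, history_date))""")

rows = [
    (10000002, 34, "2015-07-05", 5.1),
    (10000002, 34, "2015-07-06", 5.2),
]
conn.executemany("INSERT OR IGNORE INTO market_history VALUES (?,?,?,?)", rows)

# The next day's full CREST dump overlaps almost entirely with what we have.
# INSERT OR IGNORE silently drops the duplicates, but the DB still has to
# attempt every row -- which is exactly the inefficiency described above.
next_dump = rows + [(10000002, 34, "2015-07-07", 5.3)]
conn.executemany("INSERT OR IGNORE INTO market_history VALUES (?,?,?,?)",
                 next_dump)

count = conn.execute("SELECT COUNT(*) FROM market_history").fetchone()[0]
print(count)  # 3: duplicates were ignored, not inserted twice
```

The result is correct either way; the cost is that every redundant row is still shipped to and checked by the database.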
Aineko Macx
#7 - 2015-07-07 12:32:46 UTC
Selaria Unbertable wrote:
The problem is updating the table in the database for each region/typeID combination. I don't even bother trying to filter anything; I just throw everything into the database with an "INSERT IGNORE..." query and let MySQL sort it out. But this is highly inefficient, since only a few thousand entries are added every day, while about a million are redundant.

I don't get it, start filtering and those inefficiencies are taken care of.

In this particular case I'm much more bothered about the wasted time, CPU and bandwidth. CCP might not care about the ergonomics of CREST outside of their own uses, but in this case the suboptimal API causes unnecessary load on their end too.
CCP FoxFour
C C P
C C P Alliance
#8 - 2015-07-07 13:32:54 UTC
Aineko Macx wrote:
Selaria Unbertable wrote:
The problem is updating the table in the database for each region/typeID combination. I don't even bother trying to filter anything; I just throw everything into the database with an "INSERT IGNORE..." query and let MySQL sort it out. But this is highly inefficient, since only a few thousand entries are added every day, while about a million are redundant.

I don't get it, start filtering and those inefficiencies are taken care of.

In this particular case I'm much more bothered about the wasted time, CPU and bandwidth. CCP might not care about the ergonomics of CREST outside of their own uses, but in this case the suboptimal API causes unnecessary load on their end too.


Not really. This eliminates almost all load on our end. NGINX is running at around 99.95% unused CPU and is doing all the page caching for us. If we allowed you to specify date ranges and things, the request would have to go past NGINX to a CREST proxy, which then has to make another network call to the market node to get the market data, and then filter it; only then can NGINX cache that result with those query parameters. Considering the number of possible query parameter combinations if we allowed you to specify a date... load on our end would go up a lot more.
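The scale of the cache-key blow-up can be sketched with back-of-the-envelope numbers (both counts below are rough guesses, not official figures):

```python
# Every distinct URL (including query parameters) is a separate cache entry
# for NGINX, so the number of cacheable keys is what matters.
REGIONS = 100          # market regions, approximate
TYPE_IDS = 10_000      # published market type IDs, approximate

current = REGIONS * TYPE_IDS                 # one cacheable URL per pair today
fixed_windows = current * 7                  # e.g. 1/7/30/60/90/120/366 days
arbitrary_dates = current * 366 * 366 // 2   # any (from, to) pair within a year

print(fixed_windows // current)              # 7 variants per region/type pair
print(arbitrary_dates // current)            # ~67,000 variants per pair
```

A small fixed set of durations multiplies the key space by a constant; arbitrary date ranges multiply it by tens of thousands, which is why the two proposals behave so differently for a page cache.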

@CCP_FoxFour // Technical Designer // Team Tech Co

Third-party developer? Check out the official developers site for dev blogs, resources, and more.

Zifrian
The Frog Pond
Ribbit.
#9 - 2015-07-07 13:50:15 UTC
CCP FoxFour wrote:
Aineko Macx wrote:
Selaria Unbertable wrote:
The problem is updating the table in the database for each region/typeID combination. I don't even bother trying to filter anything; I just throw everything into the database with an "INSERT IGNORE..." query and let MySQL sort it out. But this is highly inefficient, since only a few thousand entries are added every day, while about a million are redundant.

I don't get it, start filtering and those inefficiencies are taken care of.

In this particular case I'm much more bothered about the wasted time, CPU and bandwidth. CCP might not care about the ergonomics of CREST outside of their own uses, but in this case the suboptimal API causes unnecessary load on their end too.


Not really. This eliminates almost all load on our end. NGINX is running at around 99.95% unused CPU and is doing all the page caching for us. If we allowed you to specify date ranges and things the request would have to go past NGINX to a CREST proxy, at which point it has to make another network call to the market node to get the market data, and then it has to filter it, then NGINX can cache that result with those query parameters. Considering the number of possible query parameter combinations if we allow you to specify a date... load on our end would go up a lot more.

Understood. Is there any possible middle ground on this, though? What about fixed-duration queries of, say, 1, 7, 30, 60, 90, 120, or 366 days? That limits it to just seven possible queries, and one of them is already the current query. This is how I limit EMD data from their API for market history. That data is so incomplete, though, that I really, really want to use CREST, but it takes about 3 minutes to load everything. That's fine for the first download, but for daily use it gets impractical. If nothing else, just a 30-day query would be a huge help.
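Client-side, a fixed-duration scheme like this stays simple: pick the smallest window that covers what you need. The endpoint itself is hypothetical; this only shows the selection logic such a scheme would imply:

```python
# Hypothetical fixed query windows, as proposed above.
WINDOWS = [1, 7, 30, 60, 90, 120, 366]

def pick_window(days_needed: int) -> int:
    """Return the smallest fixed window that covers the requested span."""
    for window in WINDOWS:
        if window >= days_needed:
            return window
    return WINDOWS[-1]  # anything beyond a year falls back to the full query

print(pick_window(5))    # 7
print(pick_window(45))   # 60
```

From the cache's point of view, every client asking for "about a month" collapses onto the same 30-day URL, which is what keeps the scheme cache-friendly.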

Maximize your Industry Potential! - Download EVE Isk per Hour!

Import CCP's SDE - EVE SDE Database Builder

Aineko Macx
#10 - 2015-07-07 16:00:11 UTC  |  Edited by: Aineko Macx
CCP FoxFour wrote:
NGINX is running at around 99.95% unused CPU

What a shame.

CCP FoxFour wrote:
If we allowed you to specify date ranges and things the request would have to go past NGINX to a CREST proxy, at which point it has to make another network call to the market node to get the market data, and then it has to filter it, then NGINX can cache that result with those query parameters. Considering the number of possible query parameter combinations if we allow you to specify a date... load on our end would go up a lot more.

That wasn't what was suggested as a compromise. We get that the current architecture doesn't lend itself to requests with arbitrary ranges (or selected IDs). As Zifrian and others pointed out, simply having the option of limiting the returned history to a fixed timeframe like a week or a month would significantly cut down on the time and bandwidth required for a full scan, while not exploding the number of items to be cached on your end.
Selaria Unbertable
Bellator in Capsulam
#11 - 2015-07-08 10:17:40 UTC
Zifrian wrote:
CCP FoxFour wrote:
Aineko Macx wrote:
Selaria Unbertable wrote:
The problem is updating the table in the database for each region/typeID combination. I don't even bother trying to filter anything; I just throw everything into the database with an "INSERT IGNORE..." query and let MySQL sort it out. But this is highly inefficient, since only a few thousand entries are added every day, while about a million are redundant.

I don't get it, start filtering and those inefficiencies are taken care of.

In this particular case I'm much more bothered about the wasted time, CPU and bandwidth. CCP might not care about the ergonomics of CREST outside of their own uses, but in this case the suboptimal API causes unnecessary load on their end too.


Not really. This eliminates almost all load on our end. NGINX is running at around 99.95% unused CPU and is doing all the page caching for us. If we allowed you to specify date ranges and things the request would have to go past NGINX to a CREST proxy, at which point it has to make another network call to the market node to get the market data, and then it has to filter it, then NGINX can cache that result with those query parameters. Considering the number of possible query parameter combinations if we allow you to specify a date... load on our end would go up a lot more.

Understood. Is there any possible middle ground on this, though? What about fixed-duration queries of, say, 1, 7, 30, 60, 90, 120, or 366 days? That limits it to just seven possible queries, and one of them is already the current query. This is how I limit EMD data from their API for market history. That data is so incomplete, though, that I really, really want to use CREST, but it takes about 3 minutes to load everything. That's fine for the first download, but for daily use it gets impractical. If nothing else, just a 30-day query would be a huge help.


I would suggest a similar approach. Instead of a date parameter, why not allow fetching data for fixed intervals (like 7, 14, or 30 days)? Would that work better?
Selaria Unbertable
Bellator in Capsulam
#12 - 2015-07-11 17:25:26 UTC
Aineko Macx wrote:
Selaria Unbertable wrote:
The problem is updating the table in the database for each region/typeID combination. I don't even bother trying to filter anything; I just throw everything into the database with an "INSERT IGNORE..." query and let MySQL sort it out. But this is highly inefficient, since only a few thousand entries are added every day, while about a million are redundant.

I don't get it, start filtering and those inefficiencies are taken care of.

In this particular case I'm much more bothered about the wasted time, CPU and bandwidth. CCP might not care about the ergonomics of CREST outside of their own uses, but in this case the suboptimal API causes unnecessary load on their end too.


Thought about this again today and just implemented a simple date-based filter for the results. Looks good; it might speed things up significantly. I was a bit sceptical about this before, but it seems to work quite well. Didn't really expect that ;)
Zifrian
The Frog Pond
Ribbit.
#13 - 2015-07-11 22:41:07 UTC
Selaria Unbertable wrote:
Aineko Macx wrote:
Selaria Unbertable wrote:
The problem is updating the table in the database for each region/typeID combination. I don't even bother trying to filter anything; I just throw everything into the database with an "INSERT IGNORE..." query and let MySQL sort it out. But this is highly inefficient, since only a few thousand entries are added every day, while about a million are redundant.

I don't get it, start filtering and those inefficiencies are taken care of.

In this particular case I'm much more bothered about the wasted time, CPU and bandwidth. CCP might not care about the ergonomics of CREST outside of their own uses, but in this case the suboptimal API causes unnecessary load on their end too.


Thought about this again today and just implemented a simple date-based filter for the results. Looks good; it might speed things up significantly. I was a bit sceptical about this before, but it seems to work quite well. Didn't really expect that ;)

How did you do your filter? A DB lookup, or just filtering the text as you loop through it? I'm not having much success.

Maximize your Industry Potential! - Download EVE Isk per Hour!

Import CCP's SDE - EVE SDE Database Builder

Aineko Macx
#14 - 2015-07-12 06:42:53 UTC
Simplest would be to fetch the most recent history date from the DB, then skip all items in the CREST response older than that when preparing the insert.

Coming from EMDR, where the DB data could become non-contiguous, I actually fetch all available dates within the range of the response and only insert the ones I don't already have.
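Aineko's gap-tolerant variant can be sketched like this (field names and dates are illustrative): instead of comparing against the single most recent stored date, compare against the full set of stored dates, so holes in the history get filled too.

```python
from datetime import date

# Dates already present in the DB for this region/type pair -- possibly
# non-contiguous, as could happen with EMDR-sourced data (07-09 is missing).
stored_dates = {date(2015, 7, 8), date(2015, 7, 10)}

crest_items = [
    {"date": "2015-07-08", "avgPrice": 5.0},
    {"date": "2015-07-09", "avgPrice": 5.1},   # fills the gap
    {"date": "2015-07-10", "avgPrice": 5.2},
    {"date": "2015-07-11", "avgPrice": 5.3},   # genuinely new
]

# Insert only the dates we don't already have. A plain "newer than MAX(date)"
# filter would skip the 07-09 row and leave the gap in place.
to_insert = [item for item in crest_items
             if date.fromisoformat(item["date"]) not in stored_dates]

print([item["date"] for item in to_insert])  # ['2015-07-09', '2015-07-11']
```

The set lookup is O(1) per row, so this costs barely more than the simple max-date filter while also repairing non-contiguous history.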
Selaria Unbertable
Bellator in Capsulam
#15 - 2015-07-12 09:34:02 UTC
Aineko Macx wrote:
Simplest would be to fetch the most recent history date from the DB, then skip all items in the CREST response older than that when preparing the insert.

Coming from EMDR, where the DB data could become non-contiguous, I actually fetch all available dates within the range of the response and only insert the ones I don't already have.


My first thought too. With just some minor adjustments to the code, it seems to perform much faster.