Pages: 1 2 3 4 5 6 7 8 [9] 10 11 .. 11 :: one page |
Dragonaire
Corax. The Big Dirty
41
|
Posted - 2012.04.26 14:21:00 -
[241] - Quote
I do understand the concern, and when I get around to adding uploading to Yapeal there may be a few more sources showing up as well, though EveMon will probably be the bigger change in numbers. My main point is that UF was created to be a simple data exchange format, not a routing and duplicate-detection system; how that is handled is up to the endpoint receiving the message. UploadKeys was added as an optional part that lets data sources and receiving endpoints pass a basic, one-way, pre-agreed authorization or ID, not to solve routing issues.
On the possible database load issue I have to ask a question here. Are you already seeing 1000s or 10000s of queries per second now? The reason I ask is that I know that, with a well-designed DB (correct indexes, normalized) and well-designed queries, a single US$15-a-month cloud node can handle that kind of load. If you are seeing less than a couple hundred per second and having problems now, you need to be figuring out why. I know from my own experience working on Yapeal that some very simple changes can make a huge difference, especially with anything that does a lot of writes, like Yapeal, or anything trying to merge lots of data, like the current endpoints have to. Hopefully no one using MySQL is trying to use MyISAM instead of InnoDB or TokuDB. The table locks and lack of transactions will kill you quickly. Finds camping stations from the inside much easier. Designer of Yapeal for Eve API. Check out the Yapeal PHP API library thread for more information. |
Desmont McCallock
169
|
Posted - 2012.05.01 18:03:00 -
[242] - Quote
What I haven't seen so far from any of you is use of the spec's URIs for API entry points.
Quote:Uploads [SITEROOT]/api/upload
Syndication [SITEROOT]/api/syndicate |
Ilyk Halibut
Blackwater USA Inc. Against ALL Authorities
4
|
Posted - 2012.05.01 18:47:00 -
[243] - Quote
I'm not sure I understand the point. There really aren't any details, and I'd need to be instructed as to how I should act differently under /syndicate vs. /upload. EVE Market Data Relay - A real-time feed of EVE Market data http://www.eve-emdr.com |
Desmont McCallock
169
|
Posted - 2012.05.01 18:50:00 -
[244] - Quote
I'm pretty sure this was written without EMDR in mind. So I'm bringing this up, in order for the specs to be revised or come to an understanding. |
Packtu'sa
Nabaal Construction and Industrials Corp Nabaal Syndicate
0
|
Posted - 2012.05.01 22:04:00 -
[245] - Quote
Hi, folks. I'm new to this discussion, so I hope I haven't missed any important details.
Desmont, I think it would help to consider actions instead of participants in your ABC model. Fundamentally, any given player can do any of the following:
- Produce a payload with new data
- Transmit a payload
- Receive a payload
A cache scraper will produce and transmit, while a relay receives and transmits, and a market analyzer only receives. Some kind of tool might do all three, but it shouldn't really matter because we should treat each action separately.
One step toward avoiding duplicates is to make sure that retransmitted payloads don't introduce differences. If there are any fields that are allowed to change in retransmission, those should be an explicit part of the standard.
Duplication comes in a few forms:
- Stored duplicates - the same payload is stored multiple times. A properly designed database should reject redundant rows, but there must be an efficient way to compare two payloads for equivalence.
- Transmitted duplicates - this is where relays and an EVEMon uploader can get us into trouble. If a few thousand payloads are uploaded per minute, and then those payloads start bouncing between relays indefinitely, we're in trouble.
If the storage problem can be solved, then a possible solution for the transmission issue is for each relay (or similar participant) to store some recent history, say the last 24 hours of payloads. Any incoming payload is checked against that cache, or if it's dated to longer than 24 hours ago, automatically rejected. Participants that store their entire history of payloads essentially have a perfect cache.
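A minimal sketch of the short-term cache idea, assuming each payload can be reduced to some stable fingerprint string (for example a hash of its content) and carries a generation timestamp. The class name, method name and parameters are hypothetical; nothing here is part of the UF spec.

```python
import time


class RecentPayloadCache:
    """Remembers payload fingerprints for a fixed window (24 hours here)."""

    def __init__(self, max_age_seconds=24 * 60 * 60):
        self.max_age = max_age_seconds
        self._seen = {}  # fingerprint -> time it was first accepted

    def accept(self, fingerprint, generated_at, now=None):
        """Return True if the payload is new and should be relayed."""
        now = time.time() if now is None else now
        # Forget entries that have aged out of the window.
        self._seen = {f: t for f, t in self._seen.items()
                      if now - t <= self.max_age}
        if now - generated_at > self.max_age:
            return False  # older than the window: cannot be checked, so reject
        if fingerprint in self._seen:
            return False  # already relayed within the window
        self._seen[fingerprint] = now
        return True


cache = RecentPayloadCache()
print(cache.accept("abc", generated_at=1000, now=1000))   # True
print(cache.accept("abc", generated_at=1000, now=2000))   # False (duplicate)
print(cache.accept("old", generated_at=0, now=100000))    # False (too old)
```

A participant that stores its whole history would simply never expire entries, which is the "perfect cache" case described above.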
I don't think "destination tags" or other such routing information in headers is a good idea. Where the data goes doesn't mean anything to someone consuming the data. It's a bit too meta, and if the payload is retransmitted, we get the same payload sent around with different information about who's supposed to have it. Duplicates are still possible.
(Example 1: suppose A is an uploader, B and C are relays, and D is an aggregation site. A uploads to B and C, so the header has them marked down. Both B and C send their data along to D, whether it's pushed or pulled, and D is added to the header. However, it's still sent twice, and now D has a duplicate. This payload won't get trapped in a loop, since nobody will retransmit to D again, but anyone downstream of D gets the duplicates.)
(Example 2: a participant has multiple, functionally equivalent upload points. Excluding hostnames listed in the header won't protect against all duplicates.)
I think the best option is to prohibit modification of a payload. The formatting can change (white space, encoding, dates) as long as it's still valid under the original specification. Payloads that differ in formatting only should still be resolved as identical. Now, this doesn't solve the "fifty people in Jita all look at Tritanium" issue, but those are different payloads and should be treated by the network as such. It's up to aggregation systems to decide how to handle such data.
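One way to resolve "formatting-only" differences, sketched under the assumption that payloads are JSON: parse the message, re-serialize it in a canonical form, and hash the result. The function name is hypothetical, and this only normalizes whitespace, key order and string escaping; equivalent date formats would need their own normalization step, which is not shown.

```python
import hashlib
import json


def payload_fingerprint(raw_json: str) -> str:
    """Hash a JSON payload so that whitespace and key order do not matter."""
    parsed = json.loads(raw_json)
    # sort_keys and fixed separators give one canonical text per document.
    canonical = json.dumps(parsed, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = '{"resultType": "orders", "version": "0.1alpha"}'
b = '{ "version":"0.1alpha",\n  "resultType":"orders" }'
c = '{"resultType": "history", "version": "0.1alpha"}'
print(payload_fingerprint(a) == payload_fingerprint(b))  # True
print(payload_fingerprint(a) == payload_fingerprint(c))  # False
```

A fixed-length fingerprint like this can also serve as a unique key in a database, so redundant rows are rejected without comparing whole payloads.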
On that topic, suppose a site wants to handle redundant data (from distinct payloads) by averaging all the prices from each day. If they then upload their payloads to a relay, they are creating a new payload. The "generator" value should reflect this.
By the way, I think there's plenty of room for interesting kinds of participants. Here's a hypothetical network to get you thinking:
- A: Cache scraper: Someone AFKs in Jita all day.
- B: EVEMon/Yapeal: A trader's orders are pulled from the API.
- C: Cached relay: This relay maintains a one-day cache to avoid duplicates.
- D: Aggregate relay: This relay keeps a record of every payload it's seen to avoid duplicates.
- E: Market analyzer: Averages data from other payloads and produces its own derivative data.
- F: Archive relay: Constantly retransmits the record of its stored payloads at a certain rate so that new participants can "catch up" on historical data.
Suppose A sends to C, and B sends to D. The relays C and D both send to E and F. E sends to F, and F sends to C, which additionally sends to D. (Draw a picture if you have to.)
If there's no duplicate checking, you can see how loops form, but this system works. Suppose A produces a payload. It goes to C, where it's added to the short-term cache and sent along to D, E and F. The payload is added to the long-term caches of all three. D sends the payload to E and F, but they reject it as a duplicate. In a couple days, F retransmits the payload to C, where it gets added to the short-term cache again as a new payload. However, when C transmits it to D, E and F, it's recognized as a duplicate and rejected as usual. It's a little additional bandwidth, but that's just a consequence of the network design (which is far from optimal). The new derivative payload created by E ultimately makes its way around to the various payload stores by way of F -> C -> D.
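The hypothetical network can be checked with a toy simulation. This is a sketch only: it assumes every node remembers every payload it has seen (so C's one-day expiry and F's delayed retransmission are not modelled), and the `flood` function and `LINKS` table are made up for illustration.

```python
from collections import deque

# Who forwards to whom, as described in the post above.
LINKS = {
    "A": ["C"],
    "B": ["D"],
    "C": ["D", "E", "F"],
    "D": ["E", "F"],
    "E": ["F"],
    "F": ["C"],
}


def flood(source, payload_id, links):
    """Push one payload through the network; nodes drop duplicates."""
    seen = {node: set() for node in links}
    seen[source].add(payload_id)
    queue = deque((source, target) for target in links[source])
    accepted, rejected = [], []
    while queue:
        sender, receiver = queue.popleft()
        if payload_id in seen[receiver]:
            rejected.append((sender, receiver))  # duplicate, not forwarded
            continue
        seen[receiver].add(payload_id)
        accepted.append((sender, receiver))
        queue.extend((receiver, nxt) for nxt in links[receiver])
    return accepted, rejected


accepted, rejected = flood("A", "payload-1", LINKS)
print(accepted)  # [('A', 'C'), ('C', 'D'), ('C', 'E'), ('C', 'F')]
print(rejected)  # [('D', 'E'), ('D', 'F'), ('E', 'F'), ('F', 'C')]
```

Eight transmissions, four of them rejected, and the loop through F -> C terminates. Without the `seen` check the same queue would never drain.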
I think both push and pull capability is important for a network like this to function well, but not all nodes need to do both, and not all nodes need a local store.
(You can still do bad things with this. Imagine two cacheless relay nodes that are hooked up to each other. They'll bounce payloads between each other all day long, and they'll copy them out to other sources, constantly spamming them. This is a basic configuration error, though, not a problem with the underlying protocol. These spam nodes would be quickly blacklisted.)
Apologies if I misunderstand anything. I had a lot of thoughts as I read the thread and I thought I'd braindump a little.
|
Dragonaire
Corax. The Big Dirty
41
|
Posted - 2012.05.01 23:10:00 -
[246] - Quote
Packtu'sa - Thanks for the input. I think you understand the problem very well, and why routing info wasn't part of the original format. You are also right that looking at it in terms of actions is a better approach. Your idea of the relay maintaining a short-term cache is a good one and would solve most of the issues everyone is worrying about. It could also be useful for sites that want to offer daily feeds to people. Finds camping stations from the inside much easier. Designer of Yapeal for Eve API. Check out the Yapeal PHP API library thread for more information. |
Desmont McCallock
172
|
Posted - 2012.05.05 14:04:00 -
[247] - Quote
BattleClinic has been added to UF supporting endpoints. |
Kaladr
Dreddit Test Alliance Please Ignore
31
|
Posted - 2012.05.07 05:54:00 -
[248] - Quote
EVE-Central (finally) has support for the unified format on the "correct" endpoints.
Next up: testing interoperability, and solving the routing-loop problem if we hook up to EMDR. Creator of EVE-Central.com, the longest running EVE Market Aggregator |
Ilyk Halibut
Blackwater USA Inc. Against ALL Authorities
4
|
Posted - 2012.05.07 13:36:00 -
[249] - Quote
I just thought I'd point out a spec-related issue:
Quote: Uploads
[SITEROOT]/api/upload
Methods: POST (parameter = data), GET (parameter = data), PUT (RESTful)
I know that at least nginx (and, I think, Apache) has smallish limits on the maximum size of a GET parameter. I believe the limit is low enough that most or all uploads will fail if done via GET. Also, people really shouldn't be using a GET to do what should be a POST.
The other thing I had is to ask why we are using a 'data' parameter, instead of a non-formencoded POST body of JSON. Seems like that would be the simplest way to go. EMDR currently accepts uploads this way. EVE Market Data Relay - A real-time feed of EVE Market data http://www.eve-emdr.com |
Desmont McCallock
172
|
Posted - 2012.05.07 14:08:00 -
[250] - Quote
From what I understood, dealing with the issue Ilyk has mentioned, a non-form encoded POST could be interpreted as PUT. Setting the method in a request as POST without using key-value pairs for the transmitted data is causing issues on the server because it's expecting key-value pairs.
I really would like Dragonaire to clarify this. |
|
Ilyk Halibut
Blackwater USA Inc. Against ALL Authorities
4
|
Posted - 2012.05.07 15:30:00 -
[251] - Quote
Desmont McCallock wrote: Setting the method in a request as POST without using key-value pairs for the transmitted data is causing issues on the server because it's expecting key-value pairs.
That's a library-specific quirk of whatever tools you're using. For example, this would post a non-form-encoded body using python-requests:
Quote:
import requests
body = 'This is a body'
response = requests.post("http://httpbin.org/post", data=body)
This only becomes form-encoded if we pass a key/value data structure (in the case of Python, a dict):
Quote:
keyvals = {'some_key': 'some_val'}
response = requests.post("http://httpbin.org/post", data=keyvals)
So, to follow the spec, I'd have to do this:
Quote:
import simplejson
# market_data is the UF message as a Python dict
json_str = simplejson.dumps(market_data)
keyvals = {'data': json_str}
response = requests.post("http://httpbin.org/post", data=keyvals)
Which seems kind of silly, since all we really need is a non-form-encoded body that is just JSON:
Quote:
json_str = simplejson.dumps(market_data)
response = requests.post("http://httpbin.org/post", data=json_str)
Hopefully this illustrates what I'm getting at better. EVE Market Data Relay - A real-time feed of EVE Market data http://www.eve-emdr.com |
Kaladr
Dreddit Test Alliance Please Ignore
31
|
Posted - 2012.05.07 15:40:00 -
[252] - Quote
I followed the urlencode-in-form data approach due to inertia. Post is so commonly used for form submissions that not having encoded data seemed foreign.
The HTTP specs are of course agnostic on this matter. Creator of EVE-Central.com, the longest running EVE Market Aggregator |
Dragonaire
Corax. The Big Dirty
41
|
Posted - 2012.05.07 15:49:00 -
[253] - Quote
We need to use POST for the reasons found in this thread: http://stackoverflow.com/questions/630453/put-vs-post-in-rest To sum up: since an upload isn't creating a new resource but modifying part of an existing one, PUT does not make sense.
What we are missing is a common name for the key. How about UUDIF? So as an example: UUDIF={ "resultType" : "orders", "version" : "0.1alpha", "uploadKeys" : [{ "name" : "emk", "key" : "abc" }, { "name" : "ec" , "key" : "def" }], ...
That should make all the parsers happy.
Edit: You all were posting while I was writing. I agree that it would be nice if we could work without the whole key-value thing, but most libraries do seem to prefer having them, and without them tend to make you work with the raw data directly, which is not so nice. Finds camping stations from the inside much easier. Designer of Yapeal for Eve API. Check out the Yapeal PHP API library thread for more information. |
Desmont McCallock
172
|
Posted - 2012.05.07 16:37:00 -
[254] - Quote
Currently Kaladr and I are using the word 'data', but in terms of naming I think we need something descriptive rather than abbreviated.
examples: 'data=', 'message=' |
Dragonaire
Corax. The Big Dirty
41
|
Posted - 2012.05.07 17:28:00 -
[255] - Quote
I just know that some platforms have limits on length of names etc and thought it would be unlikely to be confused with something else. If we start adding other parameters like data=,message= we might as well not use the format to start with. We only need a single key-value to make it easier to work with for most parsers and it would be helpful to all use the same thing but beyond that it really doesn't matter what it is. Finds camping stations from the inside much easier. Designer of Yapeal for Eve API. Check out the Yapeal PHP API library thread for more information. |
Desmont McCallock
172
|
Posted - 2012.05.07 17:39:00 -
[256] - Quote
You misunderstood me. The examples are candidates to replace 'UUDIF' not for addition. |
Ilyk Halibut
Blackwater USA Inc. Against ALL Authorities
4
|
Posted - 2012.05.07 19:30:00 -
[257] - Quote
What platforms really have to have a urlencoded form?
Dragonaire wrote: If we start adding other parameters like data=,message= we might as well not use the format to start with.
This is my reason for disliking the use of encoded POST bodies for our usage case, since we don't need to. A developer's inability to read documentation doesn't seem like a good justification for using form-encoded data.
Besides that, the wiki probably should be updated to not mention uploads using GET or PUT. EVE Market Data Relay - A real-time feed of EVE Market data http://www.eve-emdr.com |
Packtu'sa
Nabaal Construction and Industrials Corp Nabaal Syndicate
5
|
Posted - 2012.05.07 19:59:00 -
[258] - Quote
To nitpick, must every endpoint have the "/api/upload" path? Why does the URI of an endpoint need to be defined in the spec? |
Callean Drevus
Icosahedron Crafts and Shipping Silent Infinity
120
|
Posted - 2012.05.07 20:03:00 -
[259] - Quote
I agree on the post body issue. The data should just be the full post body, form-encoding it is just silly. Developer/Creator of EVE Marketeer
|
Desmont McCallock
172
|
Posted - 2012.05.07 20:20:00 -
[260] - Quote
The problem is that POST and PUT don't share a common syntax.
Let me explain. Let's say that one site dev decides to support only the POST method (place your name here), another decides to support only the PUT method (let's say, for example's sake, EMDR), and another decides to support both methods (EVE Central in this case, as I know it does). For all of the above, an uploader app dev (me) will have to configure each endpoint specifically so that the UF message is sent to each endpoint with that endpoint's supported method.
Conclusion: This does not make uploader app development any easier.
Edit: Having worked with site devs who use everything from Python to PHP and Scala, the PUT method seems the most convenient of all. |
|
Ilyk Halibut
Blackwater USA Inc. Against ALL Authorities
4
|
Posted - 2012.05.07 21:53:00 -
[261] - Quote
Packtu'sa wrote:To nitpick, must every endpoint have the "/api/upload" path? Why does the URI of an endpoint need to be defined in the spec?
It probably doesn't need to be. Consistency is nice, but having differing hostnames kills that from the start, so we're not really any better off with consistent endpoint paths. EVE Market Data Relay - A real-time feed of EVE Market data http://www.eve-emdr.com |
Ilyk Halibut
Blackwater USA Inc. Against ALL Authorities
4
|
Posted - 2012.05.07 22:00:00 -
[262] - Quote
Desmont McCallock wrote: Edit: Having worked with site devs that uses, from Python, to PHP and Scala, the PUT method seems the most convenient of all, for all.
It's trivial to support POST and PUT on my side at EMDR, which I can do. Technically, PUT is the "correct" way to upload market data, in this case.
That said, most well-designed libraries should be equally easy for PUT and POST. Here's a good Python example:
http://docs.python-requests.org/en/latest/api/#requests.head
You're basically just changing requests.post to requests.put. It shouldn't be any more difficult than that, especially if we're not form encoding. EVE Market Data Relay - A real-time feed of EVE Market data http://www.eve-emdr.com |
Dragonaire
Corax. The Big Dirty
41
|
Posted - 2012.05.08 05:19:00 -
[263] - Quote
OK, I guess it wasn't clear from my link why PUT would be wrong, so I'll try to explain it. PUT is for creating or updating the resource at a specific endpoint. So if you PUT to SITEROOT/api/item123.xml and can read it back from SITEROOT/api/item123.xml, then you can use PUT. If you can't, you need to use POST, since with POST it's not expected that you can get the data back from the same location. If you want to understand this more fully you'll need to read the RFCs related to both POST and PUT.
I updated the Wiki page while I was at it. Finds camping stations from the inside much easier. Designer of Yapeal for Eve API. Check out the Yapeal PHP API library thread for more information. |
Kaladr
Dreddit Test Alliance Please Ignore
32
|
Posted - 2012.05.08 05:52:00 -
[264] - Quote
As a note, when form encoding, please remember to actually form encode the contents. While most characters are safe in the message, the + in the timestamps gets decoded to a space, leading to an invalid date/time.
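The "+" problem is easy to reproduce with Python's standard library. The timestamp below is a made-up value used only for illustration.

```python
from urllib.parse import parse_qs, urlencode

timestamp = "2012-05-08T05:52:00+00:00"  # hypothetical example value

# Wrong: pasting the raw value into the body leaves "+" unescaped,
# and form decoding turns "+" into a space.
raw_body = "data=" + timestamp
print(parse_qs(raw_body)["data"][0])      # 2012-05-08T05:52:00 00:00

# Right: urlencode escapes "+" as %2B (and ":" as %3A).
encoded_body = urlencode({"data": timestamp})
print(encoded_body)                       # data=2012-05-08T05%3A52%3A00%2B00%3A00
print(parse_qs(encoded_body)["data"][0])  # 2012-05-08T05:52:00+00:00
```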
As for what key? Why not data. It's a nice four letter word, and I don't have to change anything Creator of EVE-Central.com, the longest running EVE Market Aggregator |
Desmont McCallock
172
|
Posted - 2012.05.08 06:06:00 -
[265] - Quote
Kaladr wrote: Why not data. It's a nice four letter word, and I don't have to change anything
Others will have to make changes (me included), why not you? (humor)
|
Ilyk Halibut
Blackwater USA Inc. Against ALL Authorities
4
|
Posted - 2012.05.08 14:07:00 -
[266] - Quote
EMDR will continue accepting uploads without form encoding, so if you're doing that, keep on keeping on. I'll get around to adding support for the silly form-encoded POSTs sometime. EVE Market Data Relay - A real-time feed of EVE Market data http://www.eve-emdr.com |
Kaladr
Dreddit Test Alliance Please Ignore
32
|
Posted - 2012.05.08 15:05:00 -
[267] - Quote
Desmont McCallock wrote: Kaladr wrote: Why not data. It's a nice four letter word, and I don't have to change anything Others will have to make changes (me included), why not you? (humor)
Because I feel entitled. And have a pony.
I think the best course for me is to support omni-formats, namely the form encoding and raw payload. Support is relatively trivial.
(I do not actually have a pony) Creator of EVE-Central.com, the longest running EVE Market Aggregator |
Desmont McCallock
172
|
Posted - 2012.05.08 15:13:00 -
[268] - Quote
Quoting famous words:
Quote:Give people the option to choose and they will whine why you haven't given them more. -Desmont McCallock- |
Ilyk Halibut
Blackwater USA Inc. Against ALL Authorities
4
|
Posted - 2012.05.08 15:59:00 -
[269] - Quote
Kaladr wrote: I think the best course for me is to support omni-formats, namely the form encoding and raw payload. Support is relatively trivial.
That's what I'll be doing. And as you say, it's pretty easy to support both.
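A rough sketch of what accepting both styles could look like on the receiving side, assuming the server framework hands over the raw request body and the Content-Type header. The function name is hypothetical, and 'data' is used as the form key only because that is what is currently in use in this thread.

```python
import json
from urllib.parse import parse_qs, urlencode


def extract_payload(body: bytes, content_type: str) -> dict:
    """Decode an upload sent either form-encoded or as a raw JSON body."""
    media_type = content_type.split(";")[0].strip().lower()
    text = body.decode("utf-8")
    if media_type == "application/x-www-form-urlencoded":
        fields = parse_qs(text)
        if "data" not in fields:
            raise ValueError("form upload is missing the 'data' field")
        text = fields["data"][0]
    # Anything else is treated as a plain JSON body.
    return json.loads(text)


message = {"resultType": "orders", "version": "0.1alpha"}
raw = json.dumps(message).encode("utf-8")
form = urlencode({"data": json.dumps(message)}).encode("utf-8")
print(extract_payload(raw, "application/json") == message)                    # True
print(extract_payload(form, "application/x-www-form-urlencoded") == message)  # True
```

This relies on uploaders setting Content-Type honestly. Some clients label raw bodies as form data by default, so a real endpoint may want a fallback that tries JSON first.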
EVE Market Data Relay - A real-time feed of EVE Market data http://www.eve-emdr.com |
Packtu'sa
Nabaal Construction and Industrials Corp Nabaal Syndicate
5
|
Posted - 2012.05.08 16:52:00 -
[270] - Quote
If you have to use key/value pairs, and you're sending key/value data, (this next bit should be obvious) why not use the top level keys for the POST?
Granted, the existing format isn't conducive to that, but maybe that's food for thought. (I never was a fan of rowset/row structures in XML/JSON dumps.)
I'm not really a stakeholder at this point, so take my posts as musings, not requirements. |