Discussion:
Error code discussion
Lance Haig
2011-02-14 11:45:59 UTC
As part of trying to understand the store protocol so that I can help
create the Ruby binding, Alex and I had a chat about the current
error code solution within Bongo.

Here is a copy of the IRC log as it happened.

[09:40] <LanceHaig> also If I want to go through all the c code and make
our error codes sequential will that be difficult?
[09:40] <LanceHaig> for a non C programmer?
[09:41] <so_solid_moo> virtually impossible, for anyone
[09:41] <so_solid_moo> I'm not against updating our error codes, but due
to the way code gets shared in the store, various commands will issue
the same error response
[09:42] <so_solid_moo> because it's coming from the same underlying bit
of code
[09:42] <LanceHaig> right
[09:42] <LanceHaig> ok
[09:43] <so_solid_moo> I think probably what we'd want to do is group
them more strongly
[09:43] <so_solid_moo> they're sort-of HTTPish, in that e.g. any 4xxx
code is a security issue
[09:43] <so_solid_moo> and any 5xxx is an internal error
[09:43] <LanceHaig> It is just easier to react and capture codes in my
opinion
[09:44] <LanceHaig> those would be easier I think
[09:44] <LanceHaig> I will make the binding be more contextual when
interpreting codes
[09:48] <so_solid_moo> this is something we could discuss on -devel
[09:48] <so_solid_moo> I'm a bit aware that our current error code
scheme is less than useful
[09:49] <so_solid_moo> what I'm thinking is that maybe we'd change the
codes to EEII format
[09:49] <so_solid_moo> where bindings care about the EE bit, but not the
II bit
[09:49] <so_solid_moo> so 2001, 2002, .. 2099 are all the same error
code as far as they are concerned
[09:49] <so_solid_moo> (as an example)
[10:55] <LanceHaig> right I can se that
[10:56] <LanceHaig> Will still need tounderstand :-) but that is me

Apart from my poor spelling, as always, I wanted to take Alex's advice
and bring this discussion to the list.

My take on this has come about since Baris and I have been working on
the Ruby binding.

I started creating some error classes so that we could act on those
errors should they occur. In reading through the store protocol document
I noticed that some error codes overlap, e.g. 4226, which means
something different depending on the context in which it is received.

I would prefer that we have sequential numbers so that it is easier to
listen for the code and then do some stuff to either notify someone or
fix it.

Alex then explained, as you can read above, that the current code would
make it really difficult to assign an individual error code per error,
as so much of it is reused in other parts of the store code.

He has made a suggestion above that could make things clearer. I support
it, and I am sure he will expand on it here.

How do people feel about the current scenario and the suggestions that
Alex has made?

Regards

Lance
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.
Alex Hudson
2011-02-14 12:09:06 UTC
I think my take on this was that we probably ought to think less about
the store itself, and more about the protocol (although obviously the
store needs to implement this).

There are various different classes of error, and although the protocol
we've inherited is vaguely HTTP-like (in that 1xxx responses always
represent some kind of success, for example), it has really evolved
rather than been designed.

And of course, there are always going to be situations where you are
happy to ignore certain errors, and those aren't really grouped together
at all.

So I think the design ought to be along the following lines:

1. there is always a final status code, which is success or failure of
some description

2. there may be intervening content status codes (e.g. listing the
collection contents)

3. each command should have a small number of possible status
conditions.

At the moment we have most of this in place already; it's more or less
a matter of cleaning stuff up. I also think that we ought to treat the
codes in two parts: the first being externally relevant and the second
being internally relevant (so each code is basically of the form XXYY,
and unless you're determining how something failed in the store, you
only care about the XX part).
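
To make the XXYY idea concrete, here is a minimal sketch of how a
binding might split a code into its two parts. The StatusCode name and
its methods are hypothetical and are not part of Bongo or any existing
binding; the only thing taken from the discussion is that the first two
digits are what a client compares.

```ruby
# Hypothetical helper: split a four-digit status code of the form XXYY.
# XX is the externally relevant part, YY is internal detail for debugging.
StatusCode = Struct.new(:error_class, :detail) do
  def self.parse(code)
    n = Integer(code.to_s, 10)
    unless (1000..9999).cover?(n)
      raise ArgumentError, "expected a four-digit code, got #{code.inspect}"
    end
    # divmod(100) yields [XX, YY], e.g. 5003 -> [50, 3]
    new(*n.divmod(100))
  end

  # For a client, two codes are the same error if the XX parts match.
  def same_error?(other)
    error_class == other.error_class
  end
end

a = StatusCode.parse(2001)
b = StatusCode.parse("2099")
puts a.same_error?(b)
```

With this, 2001 and 2099 compare as the same error, while the YY part
stays available for anyone debugging the store.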

As an example of how this might change these, take the CREATE command:

Results:

1. 1000 <collection guid> <create time> Created
2. 3010 Bad Arguments
3. 3014 Illegal Name
4. 3242 No store selected.
5. 4226 Collection Exists
6. 4227 GUID Exists
7. 5005 DB Library Error

(this is from the store protocol).

This could change to:

Results:

1. 10xx <collection guid> <create time> Created
2. 30xx Bad Arguments
3. 31xx Illegal Name
4. 32xx No store selected.
5. 40xx Collection Exists
6. 41xx GUID Exists
7. 50xx DB Library Error

Internally in the store, for a CREATE there may be a number of SQL
statements, so maybe the store would return 5003 if the third statement
failed. That would be useful information for developers debugging why a
command was failing when it ought to succeed, while keeping the client
code simple, because as far as the client is concerned a 5001 error is
exactly the same as a 5003.
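
As a rough illustration of what this could mean for a binding, the
sketch below maps the proposed CREATE results onto exception classes
keyed by the first two digits only. All class, constant and method
names here are hypothetical, and it assumes an error response is a
single line of the form "<code> <message>", like the success line shown
above.

```ruby
# Hypothetical exception classes; the names are illustrative only.
class StoreError < StandardError
  attr_reader :code

  def initialize(message, code)
    super(message)
    @code = code # full four-digit code, kept for debugging
  end
end

class BadArguments < StoreError; end
class IllegalName < StoreError; end
class NoStoreSelected < StoreError; end
class CollectionExists < StoreError; end
class GuidExists < StoreError; end
class DbLibraryError < StoreError; end

# First two digits => exception class, following the proposed CREATE list.
CREATE_ERRORS = {
  30 => BadArguments,
  31 => IllegalName,
  32 => NoStoreSelected,
  40 => CollectionExists,
  41 => GuidExists,
  50 => DbLibraryError
}.freeze

# Returns the rest of the line on success (10xx), raises otherwise.
def check_create_response(line)
  code_text, rest = line.split(" ", 2)
  code = Integer(code_text, 10)
  prefix = code / 100
  return rest if prefix == 10

  # Unknown prefixes fall back to the generic StoreError.
  raise CREATE_ERRORS.fetch(prefix, StoreError).new(rest.to_s, code)
end

begin
  check_create_response("5003 DB Library Error")
rescue DbLibraryError => e
  puts "database error, full code #{e.code}"
end
```

The client rescues DbLibraryError whether the store sent 5001 or 5003,
and the full code is still there for a developer who wants it.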

I would expect many of the same error codes would come out from other
commands: e.g. Bad Arguments is going to be common, so we would have to
look at whether or not we can make these codes consistent across the
entire store. It might be that one supplementary digit isn't enough and
that we actually need two.

Pat, any thoughts?

Alex.