2021-05-12

Been spending bits of time on and off looking at how I can run certain tasks when TryML is first started... in the current context, specifically that means running the code in Startup.SetupMnistImageDataStructures() to load all the MNIST images from source files into in-memory data structures (basically to cache them in memory). As it stands this method isn't getting called until the Web API receives its first HTTP request, which is no good because...

It means you don't discover exceptions pertaining to the caching (like the source files not being available) until a user has already made a request.
It means the first request is always going to be quite slow to respond. The code actually loads and caches the 40MB+ surprisingly quickly, but of course this would not be the case with 400MB of source data.

There are a number of good resources describing how to set this up... most suggest creating a custom implementation of IStartupFilter. The problem for me was that, try as I might, I couldn't get any code to run before the first HTTP request... ultimately, out of frustration, I put a breakpoint on the first line of Program.Main(), and found that execution wasn't even reaching this point until that first HTTP call!

Quite a lot of search engine digging later, I found this stackoverflow post, combined with some really good information in this article, which suggests the problem may be that I'm running on IIS with the 'In Process' hosting model. I currently don't have any 'AspNetCoreHostingModel' in my csproj file... the latter article suggests that in that case it will default to 'Out of Process'... but the sample project linked to from the stackoverflow post seems to contradict that??.. not sure. In any case, it's good to understand the difference between, and the implications of, the two options. Next step for me will be to get things working in 'Out of Process' hosting mode, and then hopefully get some more predictable behaviour wrt startup tasks.

2021-04-29

Github tag 0.6.0.0

With exception handling more or less sorted out (as per last post), I wanted to make sure that I had a way to do proper content negotiation... i.e. I didn't want clients specifically requesting 'application/xml' or 'text/html', and for my web API to blindly return JSON. Researched a bit into ASP.NET Filters, and realised I could achieve this with a custom Resource filter... i.e. it could interrogate the request header, and check for 'application/json'. I could even hook it up to my error and status code converter classes by defining a ContentNotAcceptableException (or similar) and mapping that to an HTTP 406 status. That said, interrogating the request header is slightly trickier than you'd first expect as you have to handle quality values, and wildcards like '*/*'.
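To give a feel for why the header check is trickier than a plain string comparison, here is a minimal sketch of the matching logic such a filter would need. The AcceptHeaderChecker class and its Accepts() method are hypothetical names made up for illustration (they are not part of TryML or ASP.NET), and the sketch is simplified... it splits the header on commas and treats any matching media range with a non-zero quality value as acceptable, without ranking more specific ranges above wildcards as the full HTTP rules do.

```csharp
using System;
using System.Net.Http.Headers;

// Hypothetical helper for illustration only (not part of TryML or ASP.NET)
public static class AcceptHeaderChecker
{
    // Returns true if 'acceptHeader' allows 'mediaType' (e.g. "application/json")
    public static bool Accepts(string acceptHeader, string mediaType)
    {
        // A missing or empty 'Accept' header means the client accepts anything
        if (string.IsNullOrWhiteSpace(acceptHeader))
            return true;

        string[] wanted = mediaType.Split('/');
        // Simplification: assumes no commas inside quoted parameter values
        foreach (string part in acceptHeader.Split(','))
        {
            if (!MediaTypeWithQualityHeaderValue.TryParse(part.Trim(), out MediaTypeWithQualityHeaderValue range))
                continue;
            // A quality value of 0 means 'not acceptable'... no 'q' parameter defaults to 1
            if ((range.Quality ?? 1.0) <= 0.0)
                continue;

            string[] offered = range.MediaType.Split('/');
            bool typeMatches = offered[0] == "*" || string.Equals(offered[0], wanted[0], StringComparison.OrdinalIgnoreCase);
            bool subtypeMatches = offered[1] == "*" || string.Equals(offered[1], wanted[1], StringComparison.OrdinalIgnoreCase);
            if (typeMatches && subtypeMatches)
                return true;
        }
        return false;
    }
}
```

With this, a header like 'text/html, */*;q=0.8' is treated as accepting 'application/json' (via the wildcard), whereas 'application/xml' on its own is not.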

At the end of the day, none of this was necessary as ASP.NET already has a nice, built-in way of specifying that a 406 be returned if content negotiation fails... you just need to specify the following in your Startup.ConfigureServices() method...

services.AddMvc(options =>
{
    options.RespectBrowserAcceptHeader = true;
    options.ReturnHttpNotAcceptable = true; 
});

Having done this, if the client sends an 'Accept' header that doesn't contain 'application/json' or '*/*', the web API will respond with the following header (and empty response body)...

HTTP/1.1 406 Not Acceptable
Transfer-Encoding: chunked
Server: Microsoft-IIS/10.0
X-Powered-By: ASP.NET
Date: Thu, 29 Apr 2021 03:22:39 GMT

I was tossing up whether it would be better to include a response body with more detail (which could be achieved with the exception and status code mapping technique described above)... e.g. an error object with a message like 'Cannot serve resource in any of the specified MIME types.'... but at the end of the day, is that really necessary??... the meaning of a 406 is pretty clear, and Swagger can be used to clearly document the web API's supported media/MIME types.

As part of the research on ASP.NET Filters, I also came across Exception filters... and this made me question whether my approach in using the IApplicationBuilder.UseExceptionHandler() method was the best / most appropriate. But it does seem, according to Microsoft's error handling guidelines, that UseExceptionHandler() is preferred over an Exception filter.

Finally, I came across the 'Produces' attribute, which allows you to annotate your controller classes or methods as follows...

[ApiController]
[Produces("application/json")]
[Route("api/MnistImages")]
public class MnistImageController : ControllerBase
{
    (etc...)

I experimented a bit with this, but couldn't see that it changed the web API's behaviour... i.e. having 'Produces' set or not, in combination with turning the 'Accept' header settings defined above on or off, doesn't seem to make any difference... not sure if 'Produces' is used more for Swagger generation?.. Microsoft documentation seems to suggest it should have an effect on setting the response format of a controller method... possibly it only comes into effect if multiple response formats are possible in the first place (e.g. JSON and XML), which is not the case for me.

OK, so I'm making some progress on small, but important things. Next will likely be trying to figure out how to have ASP.NET automatically 'warm up' on startup... e.g. to populate cached singleton objects on startup of the service, not on the first client call (as seems to be happening at the moment).

2021-04-11

Github tag 0.5.0.0

OK... I've made quite a few changes, so several things to write about. I mentioned last time about devoting time to getting error handling set up properly, so I should start by clarifying what the goals were in this regard. To me, having robust, consistent, and consumable error handling is as important as the primary function of an application, and in this context I wanted to achieve the following...

To utilize HTTP status codes appropriately
To make sure that in the case of both common client errors (e.g. incorrect arguments/parameters) and internal system errors that detailed information on the error was consistently conveyed back to the client
To try and adopt a public/standard/common JSON format if I could find one
To try to ensure the creation of JSON error objects, mapping errors to HTTP status codes, etc... interfered with the exception handling code and application code generally as little as possible

Error Object Format

The first step was to decide on a JSON format for the errors... I found two published standards... RFC 7807 and the 'error condition responses' section of Microsoft's REST API Guidelines. Beginning with the RFC format, there were a few things which I didn't like...

It recommends using URIs in a couple of its fields ('type' and 'instance'). Whilst I like the idea of having error info documented/published in a central place, the downside is that a consumer has to do a second lookup (i.e. to the URI) to fully understand the nature of the error... I'd prefer that all the relevant information was included in the error itself. Also, adopting the URI reference approach, I would then have to publish the error details behind a URI somewhere, which just means more work (and maintenance).
From the examples given, it seems the JSON fields vary for different types of errors... e.g. the 'out-of-credit' error listed as an example has 'balance' and 'accounts' JSON properties. Logically, using custom fields to give extra detail about the error completely makes sense... .NET of course does this in its Exception class hierarchy (additional fields being added to classes derived from Exception like ArgumentException, etc...)... but in a non-typed context (i.e. when these errors are sent as stringified JSON over a network) this doesn't work as well... since to understand the extra fields and do anything with them programmatically, the client side needs to be pre-configured with the definition/formats of all the possible errors.

With the Microsoft format, I was initially a bit puzzled by the contents of the 'code' field... the guidelines say it should contain 'one of a server-defined set of error codes'... at first I was thinking I'd need to predefine a set of application specific codes (e.g. in an enum or similar)... this took me back to the grey days of PRAGMA EXCEPTION_INIT in PL/SQL... I'm definitely in favour of clearly defining and codifying error types, but the maintenance of that and of designing them consistently (esp as an application grows) can be painful. But, after thinking about it, .NET already has these 'server-defined error codes' built in... being the classes in the Exception hierarchy! So it seemed to me an Exception class's type name would be a simple and appropriate value for this field... it also lends itself well to being extended, since anyone can define their own .NET exceptions specific to their application. The 'target' field is supposed to contain the 'target of the particular error'... a little bit vague... but the .NET Exception class has the 'TargetSite' property which is populated with the method that the exception originated in, which to me seemed like a good match. The 'details' field I didn't agree with so much... it's supposed to contain an array of Error objects (same object as the field itself is defined in)... a few reasons why this didn't make sense to me...

The object already has an 'innererror' property which maps very nicely to .NET's Exception.InnerException (and the general concept of exception stacks in most languages). Having an additional property which stores causing errors seems like unnecessary duplication.
I get the idea of there being multiple, parallel (not hierarchical) errors that contributed to an overall single error... but in practical terms, how often does this come up? .NET has the AggregateException class which captures this, but as per the documentation it's used mostly in parallel computing contexts (TPL and PLINQ). I guess you could use it to aggregate multiple parameter/argument errors into a single package, but this would also be a pain to handle in code (I can see error handling routines at the top of methods becoming unnecessarily verbose).

At the end of the day I decided to drop the 'details' property... the eventual solution I built handles AggregateExceptions in any case, and aside from that I can't see that that field would be used in the majority of error handling scenarios. The 'innererror' property is a nice idea because as mentioned it's going to map well to exception stacks in most languages... the only part I didn't like was the way it's wrapped in another object, see below (from the example in the guidelines)...

"innererror": {
"code": "PasswordError",
"innererror": {
  "code": "PasswordDoesNotMeetPolicy",
  "minLength": "6",
  "maxLength": "64",
  (etc...)
  }
}

... in the definition, the InnerError is a different object to the wrapping Error... why??... why not just make the 'innererror' property recursively hold another Error object (like the Exception.InnerException pattern). I decided to follow the .NET Exception pattern, so in my Error objects, the 'innererror' property holds another Error object.

Finally, the Microsoft guidelines show examples of InnerError objects with custom properties (e.g. 'minLength' and 'maxLength' in the above example)... I agree with this in principle, but you've got the issue described above of communicating the format definition of these to clients for them to be programmatically useful. To me, a better approach is to have a collection of name/value pairs containing the additional error info... you still need some explicit knowledge of how to deal with the information in the pairs, but the benefit over custom fields is that as a client you can consistently know where to access the information. So, in my format I included an 'attributes' property with an array of string name/value pairs. A C# ArgumentException expressed in my format looks like this...

{
  "error": {
    "code": "ArgumentException",
    "message": "Parameter 'label' contains invalid value '19'. (Parameter 'label')",
    "target": "GetByLabel",
    "attributes": [
      {
        "name": "ParameterName",
        "value": "label"
      }
    ],
    "innererror": null
  }
}

...to me, this format strikes a good balance between expressing enough detail about the error, whilst not being specific to .NET... i.e. you could easily create similar JSON objects from errors in Python, Java, etc... HttpInternalServerError is the .NET class which represents this error object.
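For illustration, the shape of that format could be modelled roughly as below. This is only a sketch... ErrorSketch, ErrorAttributeSketch and ErrorSketchJson are hypothetical stand-ins (not the real HttpInternalServerError code), and the use of System.Text.Json for serialization is an assumption made to keep the example self-contained.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical stand-ins for illustration (not the real HttpInternalServerError class)
public class ErrorAttributeSketch
{
    [JsonPropertyName("name")]
    public string Name { get; set; }

    [JsonPropertyName("value")]
    public string Value { get; set; }
}

public class ErrorSketch
{
    [JsonPropertyName("code")]
    public string Code { get; set; }

    [JsonPropertyName("message")]
    public string Message { get; set; }

    [JsonPropertyName("target")]
    public string Target { get; set; }

    // Name/value pairs holding any additional error info
    [JsonPropertyName("attributes")]
    public List<ErrorAttributeSketch> Attributes { get; set; } = new List<ErrorAttributeSketch>();

    // Recursively holds another error, like Exception.InnerException
    [JsonPropertyName("innererror")]
    public ErrorSketch InnerError { get; set; }
}

public static class ErrorSketchJson
{
    // Wraps the error in a top-level 'error' property and serializes it
    public static string Serialize(ErrorSketch error)
    {
        var wrapper = new Dictionary<string, ErrorSketch> { ["error"] = error };
        return JsonSerializer.Serialize(wrapper, new JsonSerializerOptions { WriteIndented = true });
    }
}
```

Because 'innererror' is just another instance of the same class, a client can walk the chain with a simple loop, the same way you would walk Exception.InnerException.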

Creating Errors from Exceptions

With the JSON format for errors decided on, I was thinking about how to create the equivalent error objects. Outside of the ASP.NET and Web API world, as a C# developer, you're used to creating exceptions to describe and notify of an erroneous situation in code, so I wanted to find a way to maintain this paradigm as much as possible, and to try to have the conversion to JSON done automatically. If I was to follow usual C# conventions, and have an Exception signify that something had gone wrong, then I'd need a way to convert standard Exception (and Exception-derived) objects to my HttpInternalServerError object. For this I built the ExceptionToHttpInternalServerErrorConverter class, which has a couple of nice features...

Includes conversion routines/functions for common .NET exceptions that would typically come up in Web API applications... Exception, ArgumentException, ArgumentNullException, etc...
Users can add and override conversion functions for exceptions
The class will traverse the Exception inheritance hierarchy looking for a conversion function which most closely matches the exception being converted

As per the last point, when attempting to convert exceptions, the class will step through the Exception inheritance hierarchy, from the exception's own type towards the base Exception class (using the Type.BaseType property), searching for a conversion function which either matches or is as close as possible to the type of exception being converted. Since everything derives from Exception, in the worst case the conversion function for a plain Exception will be used.
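The lookup described above can be sketched roughly as follows. This is a simplified illustration rather than the actual ExceptionToHttpInternalServerErrorConverter... the ExceptionConverterSketch class, its generic TResult parameter and its method names are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch (the real converter class may well differ)
public class ExceptionConverterSketch<TResult>
{
    private readonly Dictionary<Type, Func<Exception, TResult>> conversionFunctions =
        new Dictionary<Type, Func<Exception, TResult>>();

    // The fallback is registered against plain Exception, so a match always exists
    public ExceptionConverterSketch(Func<Exception, TResult> fallback)
    {
        conversionFunctions[typeof(Exception)] = fallback;
    }

    // Adds a conversion function for TException, or overrides the existing one
    public void AddOrReplace<TException>(Func<TException, TResult> conversionFunction)
        where TException : Exception
    {
        conversionFunctions[typeof(TException)] = e => conversionFunction((TException)e);
    }

    public TResult Convert(Exception exception)
    {
        if (exception == null)
            throw new ArgumentNullException(nameof(exception));

        // Start at the concrete type and move towards Exception via Type.BaseType
        Type current = exception.GetType();
        while (current != null)
        {
            if (conversionFunctions.TryGetValue(current, out Func<Exception, TResult> function))
                return function(exception);
            current = current.BaseType;
        }
        // Not expected to be reached, given the fallback registered in the constructor
        throw new InvalidOperationException("No conversion function found.");
    }
}
```

So if only Exception and ArgumentException have functions registered, an ArgumentNullException is handled by the ArgumentException function (its closest registered ancestor), while an InvalidOperationException falls through to the plain Exception one.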

Mapping to HTTP Status Codes

In addition to converting exceptions to an appropriate JSON format, I also wanted to map them to an appropriate HTTP status code. 500 (Internal Server Error) is suitable for more underlying and application specific exceptions, but exceptions that derive from ArgumentException (to me at least) map better to 400 (Bad Request). Hence I created class ExceptionToHttpStatusCodeConverter... similar to the ExceptionToHttpInternalServerErrorConverter above, it traverses the Exception inheritance hierarchy and maps an exception to an HTTP status code. Also as above, it includes default mappings and the ability to override existing mappings, or to add new ones. As an example you could elect to map UnauthorizedAccessExceptions to 401 (Unauthorized).
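A minimal sketch of that kind of mapping, assuming the two defaults mentioned above (the StatusCodeMapperSketch class and its SetMapping() method are hypothetical, not the real ExceptionToHttpStatusCodeConverter)...

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical sketch (the real ExceptionToHttpStatusCodeConverter may differ)
public class StatusCodeMapperSketch
{
    // Default mappings, as described in the text above
    private readonly Dictionary<Type, HttpStatusCode> mappings = new Dictionary<Type, HttpStatusCode>
    {
        [typeof(Exception)] = HttpStatusCode.InternalServerError,
        [typeof(ArgumentException)] = HttpStatusCode.BadRequest
    };

    // Adds a new mapping or overrides an existing one
    public void SetMapping(Type exceptionType, HttpStatusCode statusCode)
    {
        if (!typeof(Exception).IsAssignableFrom(exceptionType))
            throw new ArgumentException("Type must derive from Exception.", nameof(exceptionType));
        mappings[exceptionType] = statusCode;
    }

    public HttpStatusCode Convert(Exception exception)
    {
        // Walk from the concrete type towards Exception until a mapping is found
        for (Type current = exception.GetType(); current != null; current = current.BaseType)
        {
            if (mappings.TryGetValue(current, out HttpStatusCode statusCode))
                return statusCode;
        }
        return HttpStatusCode.InternalServerError;
    }
}
```

Calling SetMapping(typeof(UnauthorizedAccessException), HttpStatusCode.Unauthorized) would then give the 401 mapping mentioned above, and casting the result to an int gives the value to assign to the response's status code.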

Integrating into Controller Methods

Now I had a method to convert standard exceptions into corresponding HttpInternalServerError objects and HTTP status codes, but I wanted to find a way to execute this conversion as seamlessly and unintrusively as possible. Inside ASP.NET controller methods, there are some cases where you need to raise explicit Web API-specific errors... like calling the base method NotFound() to return a 404. But for unexpected, more underlying, or .NET specific exceptions, I wanted to avoid having excessive try/catch statements within the controller methods themselves, and similarly to avoid having HttpInternalServerError conversion and serialization code within the controller methods. It turns out that ASP.NET Core has a really nice mechanism for this... in the form of the IApplicationBuilder.UseExceptionHandler() method, which sets a custom exception handler that intercepts any thrown exceptions (as outlined here). In the application's Startup.Configure() method I pass a lambda function to UseExceptionHandler() which...

Uses the above converter classes to find an HttpInternalServerError format and HTTP status code which map to the thrown exception
Serializes the HttpInternalServerError to JSON
Sets the response content type to be JSON

... as per below...

app.UseExceptionHandler(errorApp =>
{
    var exceptionToHttpStatusCodeConverter = (ExceptionToHttpStatusCodeConverter)app.ApplicationServices.GetService(typeof(ExceptionToHttpStatusCodeConverter));
    var exceptionToHttpInternalServerErrorConverter = (ExceptionToHttpInternalServerErrorConverter)app.ApplicationServices.GetService(typeof(ExceptionToHttpInternalServerErrorConverter));

    errorApp.Run(async context =>
    {
        // Get the exception
        var exceptionHandlerPathFeature = context.Features.Get<IExceptionHandlerPathFeature>();
        Exception exception = exceptionHandlerPathFeature?.Error;

        if (exception != null)
        {
            context.Response.ContentType = jsonHttpContentType;
            context.Response.StatusCode = (Int32)exceptionToHttpStatusCodeConverter.Convert(exception);
            HttpInternalServerError httpInternalServerError = exceptionToHttpInternalServerErrorConverter.Convert(exception);
            var serializer = new HttpInternalServerErrorJsonSerializer();
            await context.Response.WriteAsync(serializer.Serialize(httpInternalServerError).ToString());
        }
        else
        {
            // Not sure if this situation can arise, but will leave this handler in while testing
            throw new Exception("'exceptionHandlerPathFeature.Error' was null whilst handling exception.");
        }
    });
});

I also decided to register singleton instances of the ExceptionToHttpInternalServerErrorConverter and ExceptionToHttpStatusCodeConverter classes in ASP.NET Core's dependency injection, as this allows changing or adding to the conversion mappings at runtime.

Results

I'm pretty happy with the overall result. It means that in my controller methods I can just focus on success case code and some REST-specific error cases (like NotFound()/404 mentioned above), and that exceptions can be caught and dealt with exactly as they would be in regular (non ASP.NET) C#... knowing that if errors do occur, they will be communicated to clients in a detailed and consistent way.

I should mention something regarding security... potentially, in a live, publicly accessed Web API, you may not want to expose inner details of the application and code by returning the entire exception stack to arbitrary clients. At some point down the track I may need to look at some sort of filtering mechanism to selectively remove sensitive data from exception messages... but I'm better off starting with a mechanism that returns verbose details, and allows for selective removal (as opposed to returning limited data and then not being able to access the full details when they are needed).

ASP.NET Exception Handling Method Comparison

Until I'd actually experimented with various different ways of throwing exceptions in controller methods (through the above process), it wasn't always clear what data would be returned, and in what format. For reference, 5 different methods of configuring and throwing exceptions within controller methods are compared below...

1. Uncaught Exception using the UseDeveloperExceptionPage() page option.

ASP.NET applications include the following statement in the Startup.Configure() method by default...

if (env.IsDevelopment())
{
    app.UseDeveloperExceptionPage();
}

With this set, throwing an uncaught exception within a controller method (and development context) will return HTML detailing the Exception in the below format...

HTTP/1.1 500 Internal Server Error
Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/10.0
X-Powered-By: ASP.NET
Date: Mon, 29 Mar 2021 22:03:50 GMT
Content-Length: 24564

This is a nice convenience during development, but not suitable for production code (as the returned HTML is difficult to deal with programmatically).

2. Uncaught Exception without using the UseDeveloperExceptionPage() page option.

Commenting out the above UseDeveloperExceptionPage() statement results in the following header with no body/content being returned...

HTTP/1.1 500 Internal Server Error
Transfer-Encoding: chunked
Server: Microsoft-IIS/10.0
X-Powered-By: ASP.NET
Date: Sun, 11 Apr 2021 05:48:29 GMT

3. Returning a 400 (Bad Request) using the ControllerBase.BadRequest() method.

The ASP.NET ControllerBase contains multiple methods to return specific HTTP status code responses. Calling the BadRequest() method, e.g...

return BadRequest($"Parameter '{nameof(label)}' contains invalid value '{label}'.");

returns a 400 with a textual description of the problem. I haven't tried, but expect you could also pass a JObject to the method to have JSON returned...

HTTP/1.1 400 Bad Request
Content-Type: text/plain; charset=utf-8
Server: Microsoft-IIS/10.0
X-Powered-By: ASP.NET
Date: Sun, 11 Apr 2021 05:43:02 GMT
Content-Length: 46

Parameter 'label' contains invalid value '11'.

4. Returning an RFC 7807-formatted response.

The ControllerBase.Problem method can be used to return an RFC 7807 error response. E.g. this code...

return Problem
(
    $"Parameter '{nameof(label)}' contains invalid value '{label}'.",
    null, 
    400, 
    "Parameter Error"
);

will produce the following response...

HTTP/1.1 400 Bad Request
Content-Type: application/problem+json; charset=utf-8
Server: Microsoft-IIS/10.0
X-Powered-By: ASP.NET
Date: Sun, 11 Apr 2021 05:43:59 GMT
Content-Length: 197

{
    "type": "https://tools.ietf.org/html/rfc7231#section-6.5.1",
    "title": "Parameter Error",
    "status": 400,
    "detail": "Parameter 'label' contains invalid value '11'.",
    "traceId": "|4c6d16fe-4b976d7b41da2c87."
}

5. My HttpInternalServerError class.

Just for comparison, throwing an ArgumentException as below...

throw new ArgumentException($"Parameter '{nameof(label)}' contains invalid value '{label}'.", nameof(label));

...and having it handled by my converter classes, results in...

HTTP/1.1 400 Bad Request
Cache-Control: no-cache
Pragma: no-cache
Content-Type: application/json
Expires: -1
Server: Microsoft-IIS/10.0
X-Powered-By: ASP.NET
Date: Sun, 11 Apr 2021 05:41:34 GMT
Content-Length: 278

{
  "error": {
    "code": "ArgumentException",
    "message": "Parameter 'label' contains invalid value '11'. (Parameter 'label')",
    "target": "GetByLabel",
    "attributes": [
      {
        "name": "ParameterName",
        "value": "label"
      }
    ]
  }
}