Improving fixed width file parsing
At work, I have been tasked with writing a Windows service in C# that parses fixed-width files following very specific patterns. One of those patterns is that number fields are entirely numeric and zero-padded (e.g. a 5-character field holding the integer value 10 would be 00010).
Span Parser
I don’t have a good reason to use Span<T>.
The first and easiest changes were to read the file byte by byte and to use pooled arrays, following "Strings are evil" as a guideline for improving file parsing performance. Unfortunately, each line splits according to different rules, which are determined by its first 4 characters. I currently add each possible line parser (all implementing a common interface to keep things generic) to a dictionary and process a line only if its prefix has a match in the dictionary. I pass the line to the parser as a ReadOnlySpan<T>.
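The dictionary lookup described above could be sketched roughly as follows. This is only an illustration, not the actual code from the service: the names ILineParser, LineDispatcher, Register and TryDispatch are made up, the line is assumed to arrive as a ReadOnlySpan<char>, and the 4-character prefix is assumed to be plain ASCII so it can be packed into a uint key without allocating a string.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface; the real one is not shown in this post.
public interface ILineParser
{
    void Parse(ReadOnlySpan<char> line);
}

public sealed class LineDispatcher
{
    private readonly Dictionary<uint, ILineParser> _parsers =
        new Dictionary<uint, ILineParser>();

    // Packs four characters into one uint so a lookup needs no string.
    // Assumes ASCII prefixes: only the low byte of each char is kept.
    private static uint Key(ReadOnlySpan<char> prefix) =>
        ((uint)(byte)prefix[0] << 24) | ((uint)(byte)prefix[1] << 16) |
        ((uint)(byte)prefix[2] << 8) | (byte)prefix[3];

    public void Register(string prefix, ILineParser parser)
    {
        if (prefix.Length != 4)
            throw new ArgumentException("Prefix must be 4 characters.", nameof(prefix));
        _parsers[Key(prefix.AsSpan())] = parser;
    }

    // Returns false when the line is shorter than 4 characters
    // or no parser is registered for its prefix.
    public bool TryDispatch(ReadOnlySpan<char> line)
    {
        if (line.Length < 4) return false;
        if (!_parsers.TryGetValue(Key(line.Slice(0, 4)), out var parser)) return false;
        parser.Parse(line);
        return true;
    }
}
```

Keying on a packed uint rather than a string is a design choice made for this sketch; a string key would also work, at the cost of one small allocation per line for the prefix.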
Memory Allocations
With these steps, I’m mainly trying to limit allocations, though some allocations will obviously be required from time to time. When a field is intended to be a string, the allocation is simply a new string created from a character array. When the final result is an integer or decimal type, I don’t want to jump through the hoops of converting to a string and then parsing that to an int, or to make any heap allocations that can be avoided. Instead, I am going to loop through the field in reverse and add the digits together, scaling each one by its place value (ones, tens, hundreds, and so on).
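A minimal sketch of that reverse loop, assuming the field arrives as a ReadOnlySpan<char> containing only ASCII digits. The class and method names (FixedWidthNumber, TryParseInt32) are invented for illustration, and the overflow handling is my own addition rather than something the file format requires.

```csharp
using System;

public static class FixedWidthNumber
{
    // Parses an all-digit, zero-padded field such as "00010" without allocating.
    // Returns false if the field is empty, contains a non-digit character,
    // or holds a value larger than int.MaxValue.
    public static bool TryParseInt32(ReadOnlySpan<char> field, out int value)
    {
        value = 0;
        if (field.IsEmpty) return false;

        long total = 0;
        long place = 1; // 1, 10, 100, ... for the digit currently being read

        // Walk from the last character (ones) towards the first.
        for (int i = field.Length - 1; i >= 0; i--)
        {
            int digit = field[i] - '0';
            if (digit < 0 || digit > 9) return false;

            if (digit != 0)
            {
                // A non-zero digit this far left cannot fit in an int.
                if (place > int.MaxValue) return false;
                total += digit * place;
                if (total > int.MaxValue) return false;
            }

            // Stop growing the multiplier once it is out of int range,
            // so very wide zero-padded fields cannot overflow the long.
            if (place <= int.MaxValue) place *= 10;
        }

        value = (int)total;
        return true;
    }
}
```

For the example field from the start of this post, FixedWidthNumber.TryParseInt32("00010".AsSpan(), out int n) returns true and sets n to 10. A decimal field could follow the same pattern with a decimal accumulator, though that is not shown here.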