Kevin Day [Fri, 30 Aug 2024 02:23:17 +0000 (21:23 -0500)]
Update: The FSS-0002 and FSS-0003 standards, modifying the space after Object rules.
My previous changes did not alter the behavior of the standard.
I spent some time considering this and decided that I should make this change.
The new behavior is that white spaces after the last printable character (aka "graph" character) in a valid Object are no longer considered part of the Object.
I decided to do this because supporting the space after the Object but not the space before the Object is awkward and also makes Object name matching more difficult.
One of the pillars of this project is "human first".
Doing this change makes it easier for a human to use by relaxing the exactness of a match when it comes to white spaces before or after a valid Object.
The specifications specifically include optionally supporting untrimmed Objects that include the white space before or after the Object to help accommodate the previous behavior.
I do not want to add quote support in the Object names here to keep it simple.
This new behavior seems to be a good compromise.
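A minimal sketch of the trimmed Object matching described above. It assumes ASCII input, treats only spaces and tabs as white space, and uses hypothetical helper names (`example_is_space()`, `example_object_matches()`) that are not part of the project.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: only spaces and tabs are treated as white space in this sketch.
static bool example_is_space(const char c) {
  return c == ' ' || c == '\t';
}

// Hypothetical helper: compare an Object against a name, ignoring white space before and after the Object.
static bool example_object_matches(const char * const object, const size_t length, const char * const name) {
  assert(object && name);

  size_t start = 0;
  size_t stop = length;

  // Skip the white space before the first graph character.
  while (start < stop && example_is_space(object[start])) ++start;

  // Skip the white space after the last graph character.
  while (stop > start && example_is_space(object[stop - 1])) --stop;

  return (stop - start) == strlen(name) && memcmp(object + start, name, stop - start) == 0;
}
```

White space inside the Object is still significant in this sketch; only the leading and trailing white space is relaxed.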
Kevin Day [Thu, 29 Aug 2024 03:25:33 +0000 (22:25 -0500)]
Update: Clarify the FSS-0002 and FSS-0003 standards regarding the white space before and after a valid Object.
I updated my 0.7 code and back ported some of the generated runtime tests.
I discovered problems and realized that the standard could be clearer regarding the spaces before and after the Object.
This does not change the FSS-0002 and FSS-0003 standards in any functional way.
This simply clarifies the standard regarding the spaces to make it clearer and reduce the chances of a mistake.
Kevin Day [Wed, 28 Aug 2024 03:26:32 +0000 (22:26 -0500)]
Progress: Continue working on getting FSS Embedded Read working.
The runtime tests currently pass but I am not done with the changes.
Fix several problems with the Embedded List processing from fl_fss.
- The comments are now being handled correctly along with the `close`.
I noticed that the `--columns` could be opposing `--object` and `--content`.
This would make more sense as the combination of `--object` and `--columns` is meaningless.
I could also just throw a parameter error stating that `--columns` cannot be used with `--object`.
Then I could update the runtime tests to make more sense regarding the combination of `--object` and `--columns`.
Kevin Day [Tue, 27 Aug 2024 04:22:59 +0000 (23:22 -0500)]
Progress: Continue working on getting FSS Embedded Read working and tweak allocations.
The extended list needs a separate parameter and the `line_start` needs to match the `newline_last` at the start of the function.
Record `newline_last` and `line_start` whenever possible once a new line is found.
Use the variable `comment_start` to make the purpose more clear.
The `closes` array should be initialized before calling `fl_fss_extended_list_content_read()`.
Make sure the `closes` array value is reset when the Content is not found.
Make sure the `closes` array used length is incremented.
The FSS Read functions now handle printing the close strings.
I am now considering adding an `open` variable.
Doing this would then merit moving most things into the `f_fss_item_t`.
Tweak some of the allocations in other FSS functions for allocations that need explicit sizes (`f_memory_array_increase_by()`).
Kevin Day [Sun, 25 Aug 2024 22:07:47 +0000 (17:07 -0500)]
Progress: Continue working on getting FSS Embedded Read working and also add Object close support.
This starts the work for the handling of the depths.
I noticed the tests now pass despite the depths being incomplete.
Looks like I need to add some runtime tests for depths for both 0.6 and 0.7.
This brings in the runtime test expects from the 0.6 branch.
I have some brain storming to do so that I can determine how I want to handle the depth processing logic.
I decided that now is a good time to add support for the Object close.
This is to address the problem where I cannot print the original Object close for FSS Extended List and FSS Embedded List.
The reason being that the necessary data is not actually recorded.
I have not yet updated the unit tests.
I have not yet done any actual tests to confirm that this works.
I have not yet actually utilized this in the FSS Extended List Read and FSS Embedded List Read programs.
I have only made the low level changes and made sure everything compiles.
Kevin Day [Sat, 24 Aug 2024 03:18:33 +0000 (22:18 -0500)]
Progress: Continue working on getting FSS Embedded Read working.
I noticed that I forgot that I had intentionally set the size of the static array to 777 to try and trigger any problems.
I forgot about that and committed this in my previous progress commit.
This is now restored to a valid value.
This adds additional checks to the quotes and delimits arrays.
These are used in such a way that they must match the length of the Objects and Contents.
The current, incomplete, design with the FSS Embedded Read is copying over the Objects and Contents but has not yet properly set the delimits and quotes.
This exposes that there needs to be explicit checks between the loosely associated Objects and Contents with the delimits and quotes.
I noticed that the `--object` should always result in showing things even if the `--select` number is infinitely large but does not.
This is now fixed so that it always does this.
The runtime tests are updated as appropriate.
Kevin Day [Thu, 22 Aug 2024 04:15:13 +0000 (23:15 -0500)]
Progress: Begin working on getting FSS Embedded Read working.
The FSS Embedded Read is not working because nothing is actually implemented.
The problem is that the Nest structure is different from the standard Object and Content structure.
Only a single depth is to be processed.
This means that I can construct an Objects and Contents from the Nest based on the depth and pass that to the existing functions.
The initial work has been started but it is very much incomplete.
There is a lot of work to do in this regard.
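A rough sketch of the idea of building flat Objects and Contents lists from one depth of a Nest. Every type and function name here is a simplified, hypothetical stand-in; the real Nest structure in the project is more involved.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical simplified types; these are not the project's actual structures.
typedef struct { size_t start; size_t stop; } example_range_t;
typedef struct { example_range_t object; example_range_t content; } example_item_t;
typedef struct { const example_item_t *array; size_t used; } example_items_t;
typedef struct { const example_items_t *depth; size_t used; } example_nest_t;

// Copy the Object and Content ranges from a single depth into flat arrays.
// Returns the number of items copied, which is at most "max".
static size_t example_nest_flatten(const example_nest_t * const nest, const size_t depth, example_range_t * const objects, example_range_t * const contents, const size_t max) {
  assert(nest && objects && contents);

  // A depth that does not exist has nothing to copy.
  if (depth >= nest->used) return 0;

  const example_items_t * const items = &nest->depth[depth];
  size_t total = 0;

  for (; total < items->used && total < max; ++total) {
    objects[total] = items->array[total].object;
    contents[total] = items->array[total].content;
  }

  return total;
}
```

The flat arrays could then be handed to code that only understands the standard Object and Content layout.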
Kevin Day [Wed, 21 Aug 2024 01:10:55 +0000 (20:10 -0500)]
Update: The runtime test files for FSS Embedded Read.
Update these based on the 0.6 branch that currently works.
I neglected this format during the transition and the FSS Embedded Read is currently non-functional.
The runtime tests should fail in most cases.
I will follow this change with the appropriate fixes once I identify and solve the problems.
The '}' gets printed on empty Content, which is invalid.
Set the range to out of range when the start position is the line start.
This should only happen for empty Content because valid Content is only closed when '}' is on its own line.
Move the internally managed allocation into externally managed allocation.
This allows for better memory control and optimization by the caller.
The use of `f_memory_array_increase()` is incorrect in several cases.
Switch to `f_memory_array_resize()`.
Add 2 when resizing to account for the depth position but also an additional element as a minor memory allocation optimization.
Get rid of headers that should not be included.
This does not address any of the other problems with the FSS Embedded Read functions.
Kevin Day [Sun, 18 Aug 2024 01:09:30 +0000 (20:09 -0500)]
Update: Improve FSS Read function correctness based on runtime unit tests.
This improves the runtime unit tests to be more correct as per the standard.
The standard allows for spaces after an Object.
There is no reason to print these extra spaces.
Add additional logic to better handle this behavior.
The FSS Basic Read and FSS Extended Read should not print this extra space after the Object when there is no Content to be printed.
The pipe mode should wrap empty Content with the start and end pipe characters.
This pipe mode is still not well tested (or reviewed) and will eventually need further review and runtime tests.
The handling of `--empty` should only apply to when there is empty Content and not when there is no Content at all.
This adds new callback and flags to better handle these situations.
Update the `verify.sh` script (and associated `testfile`) to print the test name.
Update the `generate.sh` and `verify.sh` scripts to safely pass arguments with spaces using `"$@"`.
Kevin Day [Mon, 12 Aug 2024 02:51:40 +0000 (21:51 -0500)]
Bugfix: FSS Read functions are not handling the --columns and --empty properly.
When the `--columns` is specified, then the max columns should only be calculated if the used array size is greater than zero.
In addition, if the `--empty` is not specified, then the columns must also have non-empty Content.
For single Content standards, such as with `fss_basic_read`, the max was always being set to one even if the `contents.used` is zero.
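One possible reading of the `--columns` rule above, as a sketch. The types, the function name, and the convention that `start > stop` marks empty Content are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical types: a range with start > stop is treated as empty Content.
typedef struct { size_t start; size_t stop; } example_range_t;
typedef struct { const example_range_t *array; size_t used; } example_ranges_t;

// Calculate the max columns across all Contents.
// Contents with no used elements never contribute, even for single Content standards.
static size_t example_columns_max(const example_ranges_t * const contents, const size_t total, const bool include_empty) {
  assert(contents || !total);

  size_t max = 0;

  for (size_t i = 0; i < total; ++i) {

    // No Content at all: do not count this as a column.
    if (!contents[i].used) continue;

    size_t columns = 0;

    for (size_t j = 0; j < contents[i].used; ++j) {

      // Without "--empty" (include_empty is false), only non-empty Content counts.
      if (include_empty || contents[i].array[j].start <= contents[i].array[j].stop) ++columns;
    }

    if (columns > max) max = columns;
  }

  return max;
}
```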
This change will need to be back ported.
This includes updates to the runtime tests.
I still need to do some in-depth review.
I overlooked a situation during the release of the 0.6.11 branch regarding the `--empty` or lack thereof.
I will likely have to fix a bug and update all of the runtime tests in the 0.6.11 for the FSS Read programs.
Kevin Day [Sun, 11 Aug 2024 21:45:02 +0000 (16:45 -0500)]
Bugfix: Incorrect runtime tests for FSS Read programs.
I mass updated the FSS Read tests in the 0.7 branch.
I chose the quick route of just using the program to generate the tests.
This requires that I trust the results to be correct.
I figured I would eventually go through each one and make sure that they are correct at a later time.
It is a development branch, after all.
However, back porting these tests to the stable 0.6 branch revealed some bugs.
I reviewed the failing tests and files and I confirmed that the 0.6 branch is getting correct results and that the tests are incorrect.
The runtime tests in the 0.7 development branches will have to be updated and the bugs there will have to be fixed.
Kevin Day [Sat, 10 Aug 2024 03:54:14 +0000 (22:54 -0500)]
Update: Add comments in some of the fss_read runtime tests.
These comments should not end up in the results and so only the source files are changed.
This should help increase the chances of catching an fss_read bug for the particular standard.
The specific comments added would otherwise be valid Object and Content data except for being commented out.
Kevin Day [Tue, 6 Aug 2024 03:12:48 +0000 (22:12 -0500)]
Security: Missing range checks on comment processing.
The fss_payload_read output, such as in the following runtime test, is wrong:
# fss_payload_read -ocn payload level_3/fss_read/tests/runtime/fss_000e/source/test-0002-mixed.fss -t
The output is 1 but should instead be 4.
# fss_payload_read -ocn payload level_3/fss_read/tests/runtime/fss_000e/source/test-0002-mixed.fss | wc -l
Investigating this problem revealed that the comment handling code is failing to perform a range check.
The overflow is causing the stop range to point to some random memory address which is almost always larger than the file.
This results in the count being wrong.
This bug is a security concern.
Add the range check in all places where this range check is missing for the comments.
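A minimal sketch of the kind of range check described above, assuming an inclusive stop position. The type and function names are hypothetical and this is not the project's actual comment handling code.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical range type with an inclusive stop position.
typedef struct { size_t start; size_t stop; } example_range_t;

// Skip over a comment that begins at range.start.
// Returns the position of the new line, or the first position past the valid range.
static size_t example_comment_skip(const char * const buffer, const size_t used, const example_range_t range) {
  assert(buffer);

  size_t i = range.start;

  // Both checks are needed: never pass the stop position and never pass the end of the buffer.
  while (i <= range.stop && i < used && buffer[i] != '\n') ++i;

  return i;
}
```

Without the `i <= range.stop && i < used` checks, a comment on the last line without a trailing new line would walk past the end of the data.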
Add additional runtime tests to reflect the condition that exposed this issue.
There is now a "payload" test for all runtime tests.
Update the testfile to make manually generating and verifying the runtime tests easier.
The "generate" and "verify" fakefile operations could not be directly called due to needing additional data setup.
Also expose the "test-" setting as a parameter to make changing it easier.
I also overlooked some cases where I could perform the same optimization used for the referenced commit in some places.
Also use the literal ASCII characters rather than the strings.
The ASCII codes are required and expected and substitution of the characters for the algorithm does not make sense here.
These are characters rather than strings.
Kevin Day [Mon, 5 Aug 2024 01:14:19 +0000 (20:14 -0500)]
Bugfix: Incorrect settings in the fss_read runtime tests.
Several of the tests are "object" tests but use "content" data.
Some tests are both "object" and "content" tests but do not use both.
This is now updated, however there are too many tests to update and fix.
The output is simply re-generated, which forces a success (even if the tests should fail).
I need to come back at a later time and review the output.
There are some known problems such as:
# fss_basic_list_read -oc -n hi -t level_3/fss_read/tests/runtime/fss_000e/source/test-0002-mixed.fss
Which results in a wrong count.
# fss_basic_list_read -c -n hi -t level_3/fss_read/tests/runtime/fss_000e/source/test-0002-mixed.fss
vs
# fss_basic_list_read -c -n hi level_3/fss_read/tests/runtime/fss_000e/source/test-0002-mixed.fss | wc -l
Historically the step was always 3.
I found, over time, that increasing the step greatly to something like 128 could greatly reduce memory consumption and performance in many cases.
In the situation where a large number of small objects are allocated, a number like 128 becomes highly abusive.
The simple low allocation step will only allocate a single unit on the very first allocation.
If the next allocation is on an array that has a size greater than one and less than four (via the tiny define), then the step size is set to four during allocation.
If the next allocation is on an array that has a size greater than four and less than eight (via the small define), then the step size is set to eight during allocation.
If the next allocation is on an array that has a size greater than eight and less than sixty-four (via the large define), then the step size is set to sixty-four during allocation.
In all cases, if the request step is less than the calculated step, then the requested step is used.
For example, if the requested step is twelve, then after eight is allocated, the next generated step size is twelve rather than sixty-four.
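The stepping rules above can be sketched as follows. The define names are hypothetical, the boundary sizes (exactly one, four, or eight) are assigned to the next larger step as an assumption, and the behavior at sizes of sixty-four or more is not described above, so this sketch simply keeps the large step.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical names for the tiny, small, and large defines.
#define EXAMPLE_STEP_TINY  4
#define EXAMPLE_STEP_SMALL 8
#define EXAMPLE_STEP_LARGE 64

// Determine the allocation step from the current array size and the requested step.
static size_t example_allocation_step(const size_t size, const size_t requested) {
  assert(requested > 0);

  size_t step = EXAMPLE_STEP_LARGE;

  if (!size) {
    // The very first allocation only allocates a single unit.
    step = 1;
  }
  else if (size < EXAMPLE_STEP_TINY) {
    step = EXAMPLE_STEP_TINY;
  }
  else if (size < EXAMPLE_STEP_SMALL) {
    step = EXAMPLE_STEP_SMALL;
  }

  // The requested step always wins when it is the smaller of the two.
  return requested < step ? requested : step;
}
```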
Using some test files shows the following reduction:
- Old: ~8GB of RAM -> New: ~200MB of RAM.
- Old: ~500MB of RAM -> New: ~20MB of RAM.
Update the unit tests accordingly and fix any problems exposed.
Kevin Day [Sun, 4 Aug 2024 00:40:49 +0000 (19:40 -0500)]
Update: Optimize away the isdigit(), isalpha(), isalnum(), and isxdigit().
I did some research and learned that the "is*()" functions can greatly affect performance due to locale and other matters.
I originally used these to allow for well established optimization to take place.
Replace these with some mathematical operations that should increase performance.
This also means no function call on the stack.
This project is already function stack heavy by design and so reducing functions when easy is a great thing.
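A sketch of the kind of mathematical replacement described above, assuming ASCII and ignoring locale entirely. The function names are hypothetical and the project may use macros or different expressions.

```c
#include <assert.h>
#include <stdbool.h>

// The arithmetic below relies on ASCII ordering, so verify that at compile time.
static_assert('9' - '0' == 9 && 'z' - 'a' == 25 && ('a' ^ 'A') == 0x20, "ASCII ordering is expected");

// A digit is within ten positions of '0'; anything below '0' wraps to a large unsigned value.
static inline bool example_is_digit(const unsigned char c) {
  return (unsigned) (c - '0') < 10u;
}

// Setting bit 0x20 folds upper case onto lower case before the range test.
static inline bool example_is_alpha(const unsigned char c) {
  return (unsigned) ((c | 0x20) - 'a') < 26u;
}

static inline bool example_is_alnum(const unsigned char c) {
  return example_is_digit(c) || example_is_alpha(c);
}

// Hexadecimal digits are the decimal digits plus the first six letters in either case.
static inline bool example_is_xdigit(const unsigned char c) {
  return example_is_digit(c) || (unsigned) ((c | 0x20) - 'a') < 6u;
}
```

Each check is a subtraction and a single unsigned comparison, which the compiler can inline without a locale lookup.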
Start using literal characters rather than the standard strings for the UTF related functions.
In these cases the ASCII expectation is guaranteed.
The ability to override these is also not practical as the meaning should not change.
I have not looked at all of the "is*()" functions and I may address any remaining ones at a later time.
I potentially may also investigate mapping tables to further improve performance.
These math calculations can be used in a lot of the non-ASCII UTF ranges as well.
I opted to not do these just yet given that such work will take a large amount of time.
I have not done any performance analysis yet but I plan to do so.
Kevin Day [Sat, 3 Aug 2024 00:43:04 +0000 (19:43 -0500)]
Bugfix: The private_f_abstruses_delete_switch() and private_f_abstruses_destroy_switch() both need wrapping defines.
The previous commit a2e1999a3e5c02a980fcbe9977b059c4639ea741 has a wrong define wrapper added.
The _di_f_abstruses_delete_ was added when it should instead be _di_f_abstruse_map_delete_.
The private_f_abstruses_destroy_switch() is also overlooked by that commit.
This adds the _di_f_abstruse_map_destroy_ to the private_f_abstruses_destroy_switch().
Kevin Day [Thu, 1 Aug 2024 02:33:32 +0000 (21:33 -0500)]
Bugfix: Add stage setting to standards to prevent build state file conflicts.
The build state stage files are conflicting in some cases.
The old solution to this problem has been observed as insufficient.
The different build settings might have the same exact file name.
I have tossed around the idea of a settings Object such as "stage" in the past but I had previously opted against it.
I now believe that skipping over this was a mistake.
Add a new feature to the standards to fix this bug.
The "stage" value may now be specified.
The fakefile files accept the "stage" setting and pass the result along to any build settings.
The settings files accept the "stage" setting and use the value by appending it to the stage files.
Only a single value is supported.
The forward and backward slashes are explicitly prohibited.
Other special characters are recommended to be avoided given the possibility of local file system problems.
Rather than erroring out, these slashes are stripped out.
The bootstrap.sh script is updated to support this.
The support for "stage" in the bootstrap.sh script is very limited.
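A small sketch of stripping the prohibited slashes from a "stage" value rather than erroring out. The function name is hypothetical and the in-place approach is an assumption, not necessarily how the project does it.

```c
#include <assert.h>
#include <stddef.h>

// Remove every forward and backward slash from a NULL terminated stage value, in place.
// Returns the resulting length.
static size_t example_stage_strip_slashes(char * const stage) {
  assert(stage);

  size_t used = 0;

  for (size_t i = 0; stage[i]; ++i) {

    // Slashes are silently dropped instead of being reported as an error.
    if (stage[i] == '/' || stage[i] == '\\') continue;

    stage[used++] = stage[i];
  }

  stage[used] = 0;

  return used;
}
```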
Kevin Day [Wed, 31 Jul 2024 02:45:20 +0000 (21:45 -0500)]
Refactor: Change bit-wise enumerations into defines.
I did some reviewing of how the enumerations used for flags are used.
These generally are not being used as a type.
An enumeration slightly increases the resulting binary size.
Enumeration values might be limited to just the int type.
This seems like an easy (small) optimization to just use defines rather than enumerations for flags and other bit-wise numbers.
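A brief sketch of bit-wise flags expressed as defines rather than an enumeration. All names are hypothetical. The wide flag illustrates a value that does not fit in an int, which is where an enumeration constant can become a problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical flags as defines; none of these names come from the project.
#define example_flag_none_d    0x0
#define example_flag_object_d  0x1
#define example_flag_content_d 0x2
#define example_flag_empty_d   0x4

// A define is free to hold a value wider than an int.
#define example_flag_wide_d 0x100000000ULL

// Each flag must occupy its own bit.
static_assert((example_flag_object_d & example_flag_content_d) == 0, "flags must not overlap");
static_assert((example_flag_content_d & example_flag_empty_d) == 0, "flags must not overlap");

// Check whether all bits of "flag" are set in "flags".
static inline bool example_flag_has(const uint64_t flags, const uint64_t flag) {
  return (flags & flag) == flag;
}
```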
Kevin Day [Sun, 28 Jul 2024 22:18:07 +0000 (17:18 -0500)]
Update: The format sentence end strings, making their usage more clear.
The sentence ends, unlike the other similar global static strings, do not have the "_single" in the name.
Add the "_single" in the name and then for consistency add the case where there should be no "_single".
These cases are as a result now handled:
- ".%r".
- "'.%r".
- "%[.%]%r"
- "%['.%]%r"
- "%[%[.%]%]%r"
- "%[%['.%]%]%r"
Kevin Day [Sun, 7 Jul 2024 03:05:54 +0000 (22:05 -0500)]
Bugfix: The f_memory array append and append all need to allow for sources to be NULL.
A valid array that is not allocated will have a size of 0.
Passing these to the function should not result in an error.
If the size is 0, then there is nothing to copy even though the array is NULL.
This is all fine.
Update the documentation comments to be more explicit on NULL in the parameters.
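A sketch of an append-all that accepts an unallocated source, as described above. The structure and function names are hypothetical simplifications and the return codes are placeholders rather than the project's status codes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical simplified dynamic array of integers.
typedef struct { int *array; size_t used; size_t size; } example_ints_t;

// Append every used element of "source" onto "destination".
// Returns 0 on success and -1 on failure.
static int example_ints_append_all(const example_ints_t source, example_ints_t * const destination) {
  assert(destination);

  // Nothing to copy: a NULL source array is perfectly fine here.
  if (!source.used) return 0;

  // Used elements without an array is the actual error case.
  if (!source.array) return -1;

  if (destination->used + source.used > destination->size) {
    const size_t size = destination->used + source.used;
    int * const resized = realloc(destination->array, size * sizeof(int));

    if (!resized) return -1;

    destination->array = resized;
    destination->size = size;
  }

  memcpy(destination->array + destination->used, source.array, source.used * sizeof(int));
  destination->used += source.used;

  return 0;
}
```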
Kevin Day [Fri, 14 Jun 2024 02:52:11 +0000 (21:52 -0500)]
Security: Console parameter single short values array is too small.
The short parameters "needs" variable now increases the array size before assignment.
The following command line calls are used to expose the problem and its resolution:
# fss_basic_list_read specifications/fss.txt +Q -cn "Featureless Settings Specifications" | iki_read +Q -w -rrrrrrrr anti-KISS 'anti-<abbr title="Keep It Simple Stupid">KISS</abbr>' ASCII '<abbr title="American Standard Code for Information Interchange">ASCII</abbr>' BOM '<abbr title="Byte Order Mark">BOM</abbr>' FSS '<abbr title="Featureless Settings Specifications">FSS</abbr>' KISS '<abbr title="Keep It Simple Stupid">KISS</abbr>' UTF-8 '<abbr title="Unicode Transformation Format 8-bit">UTF-8</abbr>' URL '<abbr title="Byte Order Mark">URL</abbr>' XML '<abbr title="Extensible Markup Language">XML</abbr>' -WWW character '<code class="code">' "</code>" code '<code class="code">' '</code>' italic '<em class="em">' '</em>'
Kevin Day [Tue, 11 Jun 2024 00:12:18 +0000 (19:12 -0500)]
Bugfix: The fl_directory_create() needs to also handle F_file_found_not.
Creating an entire directory tree is not working as expected when creating non-existent directories that are two levels or greater deep.
For example take "a/b/c", if "a" exists but neither "a/b" nor "a/b/c" exist then the create fails.
For example take "a/b", if "a" exists but not "a/b" then the create succeeds (or appears to because I never noticed the bug before).
The ENOENT (aka: F_file_found_not) is sometimes returned rather than ENOTDIR (aka: F_false) from f_directory_exists().
Process the ENOENT F_file_found_not.
I noticed some problems in the logic of the fl_directory_create() function as well.
The memcpy() needs to start from the same offset as the source copy offset.
Otherwise, the copy is overwriting the string.
Make sure to place the NULL at the "at_path" rather than at "at_path - at_tree".
The initial assignment of "tree.used" is not necessary.
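A POSIX-only sketch of creating a directory tree where a missing component (ENOENT) means "create it" rather than "fail". This uses `stat()` and `mkdir()` directly and is not the `fl_directory_create()` implementation; the function name and the buffer limit are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

// Hypothetical limit for this sketch.
#define EXAMPLE_PATH_MAX 4096

// Create every missing directory in "path", one component at a time.
// Returns 0 on success and -1 on failure.
static int example_directory_create_tree(const char * const path, const mode_t mode) {
  assert(path);

  char partial[EXAMPLE_PATH_MAX];
  const size_t length = strlen(path);

  if (!length || length >= sizeof(partial)) return -1;

  // Start at 1 so that a leading '/' is never treated as an empty component.
  for (size_t i = 1; i <= length; ++i) {
    if (path[i] != '/' && path[i] != 0) continue;

    // Copy from the start of the path so that the partial path is always complete.
    memcpy(partial, path, i);
    partial[i] = 0;

    struct stat information;

    if (stat(partial, &information) == 0) {
      if (!S_ISDIR(information.st_mode)) return -1;

      continue;
    }

    // ENOENT only means that this component still needs to be created.
    if (errno != ENOENT) return -1;

    if (mkdir(partial, mode) != 0 && errno != EEXIST) return -1;
  }

  return 0;
}
```

With "a/b/c" where only "a" exists, the loop creates "a/b" and then "a/b/c" instead of stopping at the first missing component.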
Kevin Day [Mon, 10 Jun 2024 00:00:24 +0000 (19:00 -0500)]
Update: Add all of the fl_print_format() replacement sequences as a static string.
I probably should create a single and double context for every sequence as well.
That is rather time consuming so I will do this some time in the future.
Kevin Day [Thu, 6 Jun 2024 01:11:37 +0000 (20:11 -0500)]
Feature: Add "Magic Bit" to the FSS-000F (Simple Packet) format.
Make the FSS-000F (Simple Packet) format more generalized and flexible by allowing other payload formats than only formally supporting FSS-000E (Payload).
This adds a new optional "Magic Block" that is designated via the "Magic Bit", which is the third bit from the left.
This should make it easy to store the Simple Packet as a local file.
This should make it easier for routing to optimize processing of the packet by quickly identifying the packet.
The "Control Block" and the "Size Block" have static sizes and positions, which should make it easy to identify the "Magic Block".
The third bit should be checked and then the "Magic Block" should be checked when trying to quickly identify the packet type via the "Magic Block".
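A minimal sketch of testing the "Magic Bit". It assumes the "Control Block" is the first byte of the packet, which is not stated above; the define and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// The third bit from the left of an 8-bit value is 0b00100000.
#define EXAMPLE_PACKET_MAGIC_BIT 0x20u

// Check whether the packet designates a "Magic Block" via the "Magic Bit".
// Assumption: the "Control Block" is the first byte of the packet.
static bool example_packet_has_magic_block(const uint8_t * const packet, const size_t length) {
  assert(packet);

  // An empty packet has no "Control Block" to inspect.
  if (!length) return false;

  return (packet[0] & EXAMPLE_PACKET_MAGIC_BIT) != 0;
}
```

Only when this returns true should the "Magic Block" itself be read and compared.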
Kevin Day [Sun, 2 Jun 2024 15:44:32 +0000 (10:44 -0500)]
Cleanup: Add newline before NULL comment and add dash to de-allocate.
The "Must not be NULL." documentation comments are not consistently structured.
I don't remember which decision I made and so now I am just forcibly setting the same structure with a new line before it.
Kevin Day [Thu, 23 May 2024 03:23:23 +0000 (22:23 -0500)]
Update: Remove the ++first and ++last parameters and relating logic.
I have used this for a while and have decided these are not worth the effort.
The addition is very nice but the additional code and logic are just extra maintenance and complexity for very little gain.
Kevin Day [Sat, 20 Apr 2024 04:29:22 +0000 (23:29 -0500)]
Update: Add additional time types, refactor f_time_spec and similar, and rebuild stand alone build configs.
The f_time_spec_t is not the same as "struct timespec".
Avoid confusion by renaming it to f_time_simple_t.
Rename f_date_spec_t to f_date_simple_t for the same reason.
Add additional types and, now that the f_time_spec_t name is available, create f_time_spec_t as a typedef of "struct timespec".
Update the stand alone build scripts with all of these changes and some changes from previous commits.
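A sketch of the naming split described above. The fields of the "simple" type are invented for illustration because the real f_time_simple_t layout is not shown here, and the `example_` prefix marks everything as hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

// Hypothetical layout for the renamed "simple" type; the real fields are not shown in this log.
typedef struct {
  uint64_t seconds;
  uint64_t seconds_nano;
} example_time_simple_t;

// The "spec" name now maps directly onto the standard structure.
typedef struct timespec example_time_spec_t;

static_assert(sizeof(example_time_spec_t) == sizeof(struct timespec), "the typedef must not change the size");

// Convert the simple representation into the standard one.
static example_time_spec_t example_time_simple_to_spec(const example_time_simple_t simple) {
  example_time_spec_t spec;

  spec.tv_sec = (time_t) simple.seconds;
  spec.tv_nsec = (long) simple.seconds_nano;

  return spec;
}
```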
Kevin Day [Mon, 15 Apr 2024 04:02:08 +0000 (23:02 -0500)]
Update: Add experimental ctags generation and ctags file.
This is used by projects like geany.
Unfortunately, the code is terrible and the documentation is like rotten eggs.
They seem to somehow put hard-coded paths in the ctag files which makes absolutely no sense.
Then, the geany project provides completely different ctag files that do not have this path nonsense.
The geany documentation does not relate to their actual ctag files and the ones provided by their example.
The Universal-ctags documentation, while having a lot of words, is misleading, awkward, and doesn't even describe how to get rid of these paths nor how to omit the paths.
Following the parts that do seem to read as if they mean removing the path absolutely does not do this.
The geany does not even import this file properly, despite the command coming directly from geany's documentation.
Using geany to generate this produces better results but also includes a lot of other junk that is unwanted.
It also includes the file paths.
Having the file paths makes these generated ctag files completely useless as it requires some other person to have the exact same absolute file path structure.
For now, attempt to strip out the absolute path using a sed command.
Add use of this in the unit test.
Remove seemingly duplicate unit tests.
Maybe I was trying to do a case of "bind()" returning "false", but the code does not do that in the duplicate unit tests.
Remove the duplicate unit tests as I can always add the "false" case in the future if I so choose to.