Kevin Day [Sun, 29 May 2022 03:04:46 +0000 (22:04 -0500)]
Bugfix: Fix problems exposed by unit tests in f_print.
Swap the length and string checks; the length has priority over the string.
When performing the safe character prints, F_utf should only be returned for UTF-8 characters whose width is greater than 1.
The ASCII characters are now no longer returning F_utf.
Some functions are missing the clearerr_unlocked() and ferror_unlocked() calls that are needed for proper fwrite_unlocked() error checks.
Update the documentation comments, adding missing information.
Some of the *_to* functions are not checking if the counter "i" exceeds the length before checking for NULL.
The *_to* functions for *_raw_safely* do not exist but should, to be consistent with the regular print functions.
Add the missing *_to*_raw_safely* functions.
Several of the *_to_except* functions are missing the offset parameter which should be passed for consistency with the regular print functions.
Random functions are missing the final print that should exist outside of the loop.
The "total" needs to be checked and if it represents that unprinted data is present, then print that data.
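The length check, the write error checks, and the final print outside of the loop can be sketched as follows. This is a minimal illustration and not the f_print code: print_except_nulls() is a hypothetical name, the standard fwrite(), clearerr(), and ferror() stand in for the *_unlocked() variants, and skipping NULL characters is only an assumed reason for splitting the data into runs.

```c
#include <stdio.h>

// Hypothetical helper: write "length" bytes of "string" to "stream", skipping NULL characters.
// Returns 0 on success and -1 on a write error.
static int print_except_nulls(const char *string, size_t length, FILE *stream) {
  size_t start = 0; // Where the current run of printable bytes begins.
  size_t total = 0; // Number of bytes in the current run that are not yet printed.

  // Clear any stale error indicator so that ferror() reflects only the writes below.
  clearerr(stream);

  // The length is checked before the byte is inspected for NULL.
  for (size_t i = 0; i < length; ++i) {
    if (string[i]) {
      ++total;
      continue;
    }

    // A NULL ends the current run: print what has accumulated so far.
    if (total) {
      if (fwrite(string + start, 1, total, stream) < total || ferror(stream)) return -1;
    }

    start = i + 1;
    total = 0;
  }

  // The final print outside of the loop: without it a trailing run is never printed.
  if (total) {
    if (fwrite(string + start, 1, total, stream) < total || ferror(stream)) return -1;
  }

  return 0;
}
```

Calling print_except_nulls("ab\0cd", 5, stdout) writes "abcd"; the "cd" run is only printed because of the check on "total" after the loop.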
One problem is a copy & paste mistake in the declaration of fake_make_parameter_variable_option_build_s where the wrong define is used.
Make sure to conditionally allocate the arguments array before operating on the "used" position.
The used_content variable should not be needed because the arguments.used should be 0.
After conditional allocation, ensure that the arguments.used is 0 before operation and remove the no longer needed used_content.
When the reserved IKI variables that represent program parameters are used and exist in isolation for their argument, then they should expand as separate variables.
Consider these four examples:
print 1 parameter:"build"
print 2 parameter:"build".
print 3 "parameter:"build""
print 4 "parameter:"build\" between parameter:"build""
Let's say fake is called with the following: "fake make -b /tmp/".
The "print 1" example would have the following parameters:
1) 1
2) -b
3) /tmp/
The "print 2" example would have the following parameters:
1) 2
2) -b /tmp/.
The "print 3" example would have the following parameters:
1) 3
2) -b /tmp/
The "print 4" example would have the following parameters:
1) 4
2) -b /tmp/ between -b /tmp/
The "print 1" expands into 3 parameters because the IKI variable is by itself for that given argument.
The "print 2" expands into 2 parameters because the IKI variable is not by itself for the given argument (It has a period '.' at the end).
The "print 3" expands into 2 parameters because it is quoted and is treated as a single argument.
The "print 4" expands into 2 parameters because it is quoted and is treated as a single argument and the "between" should still be between the two substitutions.
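The expansion rule in these four examples can be sketched in C. This is only an illustration under stated assumptions and not the fake implementation: expand_argument(), append(), and PARAMETER_LENGTH are hypothetical names, the quotes are assumed to have been removed by an earlier step, and the values "-b" and "/tmp/" are taken from the example call above.

```c
#include <string.h>

#define PARAMETER_LENGTH 128

// Append at most "length" bytes of "source" to the NULL terminated "destination" buffer.
static void append(char destination[PARAMETER_LENGTH], const char *source, size_t length) {
  const size_t used = strlen(destination);
  const size_t available = PARAMETER_LENGTH - 1 - used;

  if (length > available) length = available;

  memcpy(destination + used, source, length);
  destination[used + length] = 0;
}

// Expand a single argument and return the number of parameters written to "out".
static size_t expand_argument(const char *argument, const char *variable, const char *const values[], size_t count, char out[][PARAMETER_LENGTH], size_t out_size) {
  size_t used = 0;

  // The variable is alone in its argument: every value becomes its own parameter.
  if (!strcmp(argument, variable)) {
    for (size_t i = 0; i < count && used < out_size; ++i) {
      out[used][0] = 0;
      append(out[used++], values[i], strlen(values[i]));
    }

    return used;
  }

  if (!out_size) return 0;

  // Otherwise the argument stays one parameter and each occurrence becomes the values joined by spaces.
  const size_t width = strlen(variable);
  const char *at = argument;
  const char *match = 0;

  out[0][0] = 0;

  while (width && (match = strstr(at, variable))) {
    append(out[0], at, (size_t) (match - at));

    for (size_t i = 0; i < count; ++i) {
      if (i) append(out[0], " ", 1);

      append(out[0], values[i], strlen(values[i]));
    }

    at = match + width;
  }

  append(out[0], at, strlen(at));

  return 1;
}
```

With the values "-b" and "/tmp/", the isolated argument from "print 1" yields two parameters, which together with the "1" gives the three parameters listed above, while the arguments from "print 2" and "print 4" each yield a single parameter.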
A break is added at the end of one of the loops because that part of the loop is only reached after a match.
When a match is identified, the loop no longer needs further iterations.
Kevin Day [Wed, 25 May 2022 03:15:41 +0000 (22:15 -0500)]
Update: Additional test settings and add initial coverage support.
The goal of the coverage is to support gcov.
I am very unfamiliar with gcov at this time.
The settings are experimental at best.
There will likely be numerous changes relating to gcov in the future as I learn the tool and devise a process to build coverage reports.
Kevin Day [Wed, 25 May 2022 02:11:44 +0000 (21:11 -0500)]
Update: The fake "build" fakefile Object needs to support modes and fix a related bug.
The fakefile needs to be able to support being passed custom modes.
The default behavior is preserved with this change.
This requires supporting that an empty string is passed as Content.
It turns out that the quoted empty string is incorrectly being skipped.
Make sure the quoted empty strings are not skipped.
Kevin Day [Tue, 24 May 2022 04:31:37 +0000 (23:31 -0500)]
Cleanup: Existing unit tests, adjust status check behavior.
I was lazy with the previous behavior and always cleared the error bits when performing the comparison checks.
Change the behavior to properly check the status code for when the error bit is expected and when it is not.
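The difference between the two comparison styles can be sketched as follows. The names, the 16-bit width, and the position of the error bit are assumptions made for illustration and are not taken from the FLL status headers.

```c
#include <stdint.h>

// Hypothetical status layout: the highest bit of a 16-bit status marks an error.
typedef uint16_t status_t;

#define STATUS_BIT_ERROR 0x8000u

enum {
  STATUS_NONE = 1,      // Hypothetical success code.
  STATUS_PARAMETER = 2  // Hypothetical "invalid parameter" code.
};

static status_t status_set_error(status_t status) {
  return (status_t) (status | STATUS_BIT_ERROR);
}

static status_t status_clear_error(status_t status) {
  return (status_t) (status & ~STATUS_BIT_ERROR);
}

// The lazy comparison: the error bit is cleared on both sides, so its presence is never verified.
static int status_equal_lenient(status_t actual, status_t expected) {
  return status_clear_error(actual) == status_clear_error(expected);
}

// The strict comparison: the code and the error bit both have to match.
static int status_equal_strict(status_t actual, status_t expected) {
  return actual == expected;
}
```

A test that expects a plain STATUS_PARAMETER passes the lenient check even when the function returns status_set_error(STATUS_PARAMETER); the strict check catches that mismatch.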
Kevin Day [Mon, 23 May 2022 02:39:27 +0000 (21:39 -0500)]
Bugfix: Combining and Width detection for utf8 are not properly printing.
The wrong data is being passed to utf8_print_combining_or_width().
Change the behavior to send the correct string to the function.
Move the error printing to a single function and use this function in all such cases.
Kevin Day [Mon, 23 May 2022 02:07:07 +0000 (21:07 -0500)]
Cleanup: Rename bytecode to bytesequence.
The term "bytecode" already exists and is used for a slightly different purpose (representing compiled or partially compiled data).
This is a different context.
To avoid using the term improperly, switch to a more proper term "bytesequence" (as one word).
A byte sequence is a term representing a sequence of bytes.
This is more specific than binary and effectively emphasizes that this is in regards to bytes.
Avoiding the term binary, however correct or not the term may be, helps avoid confusion due to "binary" and "text" data being considered two separate things.
Kevin Day [Mon, 23 May 2022 01:51:29 +0000 (20:51 -0500)]
Cleanup: Replace macro_f_string_static_t_initialize2() with macro_f_string_static_t_initialize().
Using macro_f_string_static_t_initialize2() was the old way and is now deemed incorrect.
The macro_f_string_static_t_initialize2() applies the length parameter to both the used and size.
For static strings the size is always 0 because nothing is dynamically allocated.
Therefore, using macro_f_string_static_t_initialize2() for static string initialization is incorrect.
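The difference between the two initializers can be illustrated with stand-in definitions. The structure, the field order, and the three parameter form of the replacement macro are assumptions for this sketch; only the behavior of applying the length to both the used and the size comes from the description above.

```c
#include <stddef.h>

// Hypothetical stand-in for f_string_static_t.
typedef struct {
  const char *string;
  size_t size; // Allocated size, which is 0 when nothing is dynamically allocated.
  size_t used; // Number of bytes in use.
} string_static_t;

// Explicit form: the caller states both the size and the used length.
#define string_static_initialize(string, size, used) { string, size, used }

// Old shortcut: applies the length to both the size and the used length.
#define string_static_initialize2(string, length) { string, length, length }

// Claims 5 allocated bytes even though the string is static.
static const string_static_t example_old = string_static_initialize2("build", 5);

// Reports a size of 0 and a used length of 5.
static const string_static_t example_new = string_static_initialize("build", 0, 5);
```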
Kevin Day [Mon, 23 May 2022 01:47:53 +0000 (20:47 -0500)]
Update: Tweak endianness for f_utf_char_t.
I continue to forget that the f_utf_char_t is a big-endian format regardless of the host endianness.
I then end up comparing the endianness logic to normal operations and find discrepancies.
I waste a good bit of time to ultimately realize that the f_utf_char_t is not in host byte order.
Update the comments to better represent this situation.
I also noticed that the big endian bitwise operations are going in the wrong direction.
I could be wrong, but I think I need to do a left shift rather than a right shift.
Or perhaps, this only needs to be done on a big-endian system?
I need to test this logic on a big endian system.
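As a point of comparison while testing, here is a minimal sketch of one way to keep a byte sequence in big-endian order inside a 32-bit value. It assumes the first byte of the sequence belongs in the most significant byte of the value; whether f_utf_char_t uses exactly this layout is an assumption, and pack_sequence() and sequence_byte() are hypothetical names. Shifts operate on the value rather than on memory, so this code behaves the same on little-endian and big-endian hosts. It does not settle which shift the existing f_utf macros need.

```c
#include <stddef.h>
#include <stdint.h>

// Pack up to 4 bytes so that the first byte lands in the most significant position.
static uint32_t pack_sequence(const unsigned char *bytes, size_t width) {
  uint32_t value = 0;

  for (size_t i = 0; i < width && i < 4; ++i) {
    value |= (uint32_t) bytes[i] << (24 - 8 * i);
  }

  return value;
}

// Extract the byte at "index", where 0 is the first byte of the sequence.
static unsigned char sequence_byte(uint32_t value, size_t index) {
  if (index > 3) return 0;

  return (unsigned char) ((value >> (24 - 8 * index)) & 0xff);
}
```

For the three byte sequence 0xE2 0x82 0xAC this produces the value 0xE282AC00, and sequence_byte() returns the bytes in their original order.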
Kevin Day [Mon, 23 May 2022 01:45:31 +0000 (20:45 -0500)]
Feature: Add missing functionality allowing the utf8 program to convert back to binary data with invalid codepoints.
Even when there are invalid codepoints produced, it should be possible to convert the entire output back to the original data.
This is possible because the codepoint output by default still prints the invalid data as a hex-digit representing up to 4 bytes of data.
The combining and width parameters are also supported.
Kevin Day [Sun, 22 May 2022 03:06:56 +0000 (22:06 -0500)]
Update: Use F_utf_not instead of F_utf and other resulting changes.
The F_utf_not is semantically more correct than F_utf when returning an error for an invalid UTF-8 sequence.
Use F_utf_fragment where appropriate as well.
Updating all of the appropriate comments revealed some documentation and code structure problems in the fss projects.
These are cleaned up as well.
Kevin Day [Sat, 21 May 2022 21:22:13 +0000 (16:22 -0500)]
Update: Project f_utf.
While investigating the utf8 program, I looked into the f_utf project and found that it is still very much lacking.
At some point in the process of me writing this, the Unicode 14 was released.
I started the process of updating parts of the code and have made it as far as Gujarati with this commit.
Remove unused functions.
Add new functions for detecting if something is a superscript or a subscript.
Update the comments in the private functions to make it explicitly clear when a particular private function expects that only characters of width 2 or greater are provided.
There are some "todo" comments that need to be addressed before the stable release.
I'm expecting another release candidate at this point and so I am pushing off some of the Unicode updates until after the next release candidate.
I noticed that the unit tests for f_utf only address the structures.
While this is disappointing it does save me the effort of having to write more unit tests for the newly added functions.
Kevin Day [Fri, 20 May 2022 04:50:36 +0000 (23:50 -0500)]
Cleanup: Stale code, improve usage, replace macros.
The primary focus of this commit is to remove stale code exposed by compilers.
This pass I used clang with -Wall.
Fixed some usage cases where the variables can be replaced with other variables.
I happened to notice some macros didn't need to exist and added the appropriate methods.
I did not search for other cases like this.
I only fixed what happened to be in front of me at the time.
Kevin Day [Fri, 20 May 2022 04:07:28 +0000 (23:07 -0500)]
Security: Add missing NULL at the end of string.
I was seeing "-Wno-missing-bracess" and "-Wno-logical-op-parenthese".
This looks exactly like invalid sizes.
I could not find the code that misplaces the NULL but in the process I found a different place where a NULL is in fact missing.
This adds the missing NULL.
I guess if one looks hard enough, then they will find what they are looking for.
It turns out that my original problem is actually two typos in a configuration file.
Kevin Day [Thu, 19 May 2022 05:56:28 +0000 (00:56 -0500)]
Update: Rewrite the f_serialize functions.
The code is outdated and needs to be updated with the current practices.
I remember being uncertain on what to name several of these functions.
I didn't want an f_unserialize function because that would require a separate project (even if the naming makes sense).
Perhaps I may do that in the future, but for now just use the words "from" and "to".
Kevin Day [Wed, 18 May 2022 02:48:50 +0000 (21:48 -0500)]
Security: The realpath() calls malloc() and free() is not called (memory leak).
I changed the code and didn't realize that realpath() conditionally calls malloc().
When I changed the code to pass a variable initialized to 0, I ended up triggering realpath() to call malloc().
This results in a memory leak.
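The leak pattern can be sketched with the POSIX realpath() call. The helper path_real_length() is a hypothetical name and this is not the code from the project; it only shows that a NULL second argument makes realpath() allocate the result, which the caller then has to free.

```c
#define _XOPEN_SOURCE 700

#include <stdlib.h>
#include <string.h>

// Hypothetical helper: return the length of the resolved path, or 0 on failure.
static size_t path_real_length(const char *path) {

  // Passing NULL (a pointer initialized to 0) makes realpath() allocate the buffer with malloc().
  char *resolved = realpath(path, NULL);

  if (!resolved) return 0;

  const size_t length = strlen(resolved);

  // The caller owns the allocated buffer; leaving out this free() is the memory leak.
  free(resolved);

  return length;
}
```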
Kevin Day [Tue, 17 May 2022 02:47:06 +0000 (21:47 -0500)]
Update: Specifications.
Started using IKI format in the specification files.
Made changes and performed fixes as I noticed them while copying the specifications to the website.
Kevin Day [Mon, 16 May 2022 00:25:15 +0000 (19:25 -0500)]
Bugfix: The iki_read program is not handling verbosity correctly.
The newline should be printed at the end of the program unless in quiet mode.
This fails for two reasons:
1) The conditional checks before printing are wrong.
2) The quiet parameter is at the wrong position resulting in it being mixed up with the "no color" parameter.
Kevin Day [Sun, 15 May 2022 23:49:47 +0000 (18:49 -0500)]
Feature: The iki_read program should support wrapping a variable value.
One of the original design intentions of the IKI standard is to allow for substitution.
That substitution includes wrapping text with something like HTML markup.
The current design of iki_read falls short here.
While the substitution can be performed, the wrapping while preserving the existing value is not performed.
For example consider the following:
emphasis:"Some message."
This should be substituted with the HTML5 "<em>" tag.
The substitute parameter requires knowing the value.
The replace parameter also requires knowing the value.
The emphasis HTML5 markup needs to be prepended and appended without having to know every single value.
To solve this, the -W/--wrap option is now available.
This is a 3 parameter option that acts similar to the -r/--replace parameter.
However, it will instead accept a "before" and "after" representing the before and after strings.
Either the before or after string may be an empty string.
The design of this feature re-utilizes existing structures.
These structures have context in their names that do not match "before" and "after".
This can be confusing, but this is considered an inconvenience at this time.
The goal is to keep the changes simple if at all possible with a stable release around the corner.
I also do not know what words to use to share between the different types without creating a new one to make such a change.
This feature is necessary to ensure completeness with the original intent and design of both the IKI standard and the iki_read program.
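The wrapping behavior can be illustrated with a small C sketch. It is not the iki_read implementation; wrap_value() is a hypothetical helper that shows the "before" and "after" strings being added around a value without needing to know the value in advance.

```c
#include <stdio.h>

// Hypothetical helper: write before + value + after into "out".
// Returns 0 when the result fits into "size" bytes and -1 otherwise.
static int wrap_value(const char *before, const char *value, const char *after, char *out, size_t size) {
  const int written = snprintf(out, size, "%s%s%s", before, value, after);

  return (written < 0 || (size_t) written >= size) ? -1 : 0;
}
```

Wrapping the value of emphasis:"Some message." with "<em>" and "</em>" produces "<em>Some message.</em>". Passing an empty string for either side leaves that side untouched.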
Kevin Day [Sun, 15 May 2022 21:31:11 +0000 (16:31 -0500)]
Feature: The iki_read program should support a more generalized substitution process called "replace".
In the distant past I mixed up having only two or three substitution parameters.
After getting confused, I decided to just have a 3 argument substitution.
The three argument substitution only substitutes if both the variable name and the variable value match.
This is great but it doesn't follow the completeness theorem.
The iki_read should also handle the general case in addition to the specific case.
Provide a two argument substitution called "replace" that handles the more general case.
When any variable name matches the given replace parameter, then the variable value is replaced for all matching variable names regardless of the existing variable value.
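The difference between the three argument substitution and the two argument replace can be sketched as follows. The structure and function names are hypothetical, and any precedence between the two parameters in iki_read is not modeled here.

```c
#include <string.h>

// Hypothetical rule: "value" is only consulted by the three argument substitution.
typedef struct {
  const char *name;
  const char *value;
  const char *with;
} rule_t;

// Three argument substitution: both the variable name and the variable value have to match.
static const char *apply_substitute(const rule_t *rule, const char *name, const char *value) {
  return (!strcmp(rule->name, name) && !strcmp(rule->value, value)) ? rule->with : value;
}

// Two argument replace: only the variable name has to match, whatever the existing value is.
static const char *apply_replace(const rule_t *rule, const char *name, const char *value) {
  return !strcmp(rule->name, name) ? rule->with : value;
}
```

Given a rule for the name "emphasis", apply_substitute() leaves a variable with a different value unchanged while apply_replace() replaces it.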
Fix small mistakes in the IKI specification.
I needed to refer to the "variable value" with this change and "variable variable" is simply not the correct way.
Use "variable names" rather than "object names" to be more consistent and clear in this regard.
Kevin Day [Sun, 15 May 2022 16:06:46 +0000 (11:06 -0500)]
Update: Add cmocka specific environment variables to the testfiles.
These environment variables are commented out by default.
The "define" "settings" Object defines the environment variable.
The "environment" "settings" Object exposes that environment variable to any called programs.
Kevin Day [Sun, 15 May 2022 00:57:19 +0000 (19:57 -0500)]
Update: The f_limit project is out of date.
Writing unit tests revealed that the f_limit project does not operate like the latest code.
Restructure and rewrite f_limit to be consistent with the latest practices and designs in the rest of the project.
Kevin Day [Sat, 14 May 2022 19:42:46 +0000 (14:42 -0500)]
Update: Add new status codes and fix problems with existing ones.
The API will be frozen on the stable release.
There are upcoming changes in the next development cycle that will focus on networking.
Provide additional status codes that will be used in networking to make transition and compatibility simpler and easier.
There are also plans to add init support to the controller program.
Operations such as halt and terminate become necessary.
While working on this I noticed this introduces a discrepancy between "terminate" and "terminated".
The "terminated" is meant to focus on buffers, such as a terminated string.
To fix this conflict, I decided to favor the practice of trying to always use present tense.
This means replacing "terminated" with a present tense word.
I chose "end".
There already is an F_end, so break out a new status section and move all of the newly minted "end" types into that.
While making these changes I noticed and fixed a few problems.
There is both F_warn and F_warning.
Remove F_warn in favor of F_warning.
The F_string_too_large and F_string_too_small checks are incorrectly returning F_too_large_s and F_too_small_s, respectively, when they instead should be returning F_string_too_large_s and F_string_too_small_s.
Kevin Day [Fri, 13 May 2022 00:11:24 +0000 (19:11 -0500)]
Security: Add missing parameter checks and rename "data" to "custom".
Using "data" as the variable name for the "custom" property is confusing and can lead to mistakes.
Use "custom" to directly match that this is the "custom" property rather than the "data" property.
Kevin Day [Thu, 12 May 2022 02:49:04 +0000 (21:49 -0500)]
Bugfix: The fake program should not require the data directory when explicit fakefile or settings files are specified.
Set or reset the validate_parameter_directories check as appropriate when calling 'clean' or 'skeleton' operations.
Make the parameters_required check contingent on the presence of the parameters --fakefile and --settings.
When these are specified, do not even bother checking for the data directory at all.
Kevin Day [Thu, 12 May 2022 02:42:44 +0000 (21:42 -0500)]
Update: Add missing function f_path_is_absolute() and fix existing f_path_is_*() functions.
The f_path_is_absolute() function, being the complement of f_path_is_relative(), is now added.
I noticed multiple problems when looking at this code.
- The f_path_is_relative() and f_path_is_relative_current() functions are not checking that the max length is reached before comparing.
- The f_path_is_relative_current() is not incrementing the counter when attempting to check for the next character resulting in invalid results.
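Both problems, the missing length check and the missing counter increment, can be illustrated with a simplified sketch. These are not the f_path functions: the names are hypothetical, and the definitions (absolute means starting with '/', relative to the current directory means "." alone or a leading "./") are assumptions.

```c
#include <stddef.h>

// Assumed definition: an absolute path begins with '/'.
static int path_is_absolute(const char *path, size_t length) {

  // The length is checked before the first character is compared.
  return length > 0 && path[0] == '/';
}

// The complement of the absolute check.
static int path_is_relative(const char *path, size_t length) {
  return !path_is_absolute(path, length);
}

// Assumed definition: "." alone or a leading "./" is relative to the current directory.
static int path_is_relative_current(const char *path, size_t length) {
  size_t i = 0;

  if (i >= length || path[i] != '.') return 0;

  // Advance the counter before testing the next character.
  ++i;

  return i >= length || path[i] == '/';
}
```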
The f_path project clearly needs unit testing.
I intend to write unit tests and fix problems found before the next stable release is made.
Kevin Day [Wed, 11 May 2022 05:38:37 +0000 (00:38 -0500)]
Update: Utilize the state.flag to allow for fss read to not fail out on invalid UTF-8 code sequence and fix naming problems.
One of the original goals of the FLL project is to achieve fail-through functionality.
Knowing that this is a lot of work, I have ignored a lot of situations where I can implement fail-through and simply performed fail-out or fail-over.
With the upcoming stable release, I believe that this must handle bad data files.
This adds the option to conditionally change the behavior between fail-through and fail-out for the fss read functions and related for invalid UTF-8 code sequences.
The default behavior is now changed from fail-out to fail-through.
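The two behaviors can be sketched with a small UTF-8 scanner. The flag name, the state structure, and count_characters() are hypothetical stand-ins and not the FLL API, and the validation is deliberately simplified (overlong encodings and surrogates are not rejected). With the flag unset the scanner fails through, matching the new default described above.

```c
#include <stddef.h>
#include <stdint.h>

#define STATE_FLAG_UTF_FAIL_OUT 0x1u // Hypothetical flag bit.

// Hypothetical stand-in for a state structure carrying 32-bit flags.
typedef struct {
  uint32_t flag;
} state_t;

// Return the expected width of a sequence from its first byte, or 0 if the byte cannot start one.
static size_t utf8_width(unsigned char c) {
  if (c < 0x80) return 1;
  if ((c & 0xe0) == 0xc0) return 2;
  if ((c & 0xf0) == 0xe0) return 3;
  if ((c & 0xf8) == 0xf0) return 4;

  return 0;
}

// Count the valid characters in the buffer and store the number of skipped bytes in "invalid".
// Returns -1 on the first invalid sequence when the fail-out flag is set.
static long count_characters(const unsigned char *buffer, size_t length, state_t state, size_t *invalid) {
  long total = 0;
  size_t i = 0;

  *invalid = 0;

  while (i < length) {
    const size_t width = utf8_width(buffer[i]);
    int valid = width && width <= length - i;

    for (size_t j = 1; valid && j < width; ++j) {
      if ((buffer[i + j] & 0xc0) != 0x80) valid = 0;
    }

    if (valid) {
      ++total;
      i += width;

      continue;
    }

    // Fail-out: stop processing at the first invalid sequence.
    if (state.flag & STATE_FLAG_UTF_FAIL_OUT) return -1;

    // Fail-through: note the bad byte and keep going.
    ++(*invalid);
    ++i;
  }

  return total;
}
```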
This took longer than I hoped.
I will need to do additional reviewing of this code before the stable release is ready.
I also realized that I need to support raw printing of data in the fss read functions as well (and that means changing the existing -r/--raw parameter).
This also fixes the following naming problems:
- fl_fss_apply_delimit() should be f_fss_apply_delimit().
- fl_fss_apply_delimit_between() should be f_fss_apply_delimit_between().
Kevin Day [Wed, 11 May 2022 03:19:54 +0000 (22:19 -0500)]
Update: Utilize the state.flag to allow for iki read to not fail out on invalid UTF-8 code sequence.
One of the original goals of the FLL project is to achieve fail-through functionality.
Knowing that this is a lot of work, I have ignored a lot of situations where I can implement fail-through and simply performed fail-out or fail-over.
With the upcoming stable release, I believe that this must handle bad data files.
This adds the option to conditionally change the behavior between fail-through and fail-out for the f_iki_read() and related for invalid UTF-8 code sequences.
The default behavior is now changed from fail-out to fail-through.
Kevin Day [Tue, 10 May 2022 03:57:17 +0000 (22:57 -0500)]
Update: Add flags to the f_state_t.
Set the flag size to 32-bit as 16-bits is often small for bitwise flags.
I try to keep structures like f_state_t as minimal as possible.
However, I feel that I need to pass information to functions to allow for more flexibility.
I have mixed opinions on this as this encroaches on the Keep It Simple concepts.
However, after consideration, I believe some of this complexity is necessary for the upcoming stable release.
Future development branches will be free to change this as the project exposes the good and the bad of such a decision.
Kevin Day [Tue, 10 May 2022 02:45:09 +0000 (21:45 -0500)]
Bugfix: NULL is a valid character, causing utf8 not to properly print NULL characters.
The function f_utf_unicode_from() is incorrectly treating f_utf_char_t as a string (or a pointer).
The f_utf_char_t is a 32-bit integer.
The !0 check is therefore incorrect.
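The problem can be illustrated with a hypothetical formatter, which is not the actual f_utf_unicode_from(). Because the character is an integer and not a pointer, the value 0 is the valid NULL character and must not be rejected.

```c
#include <stdint.h>
#include <stdio.h>

// Hypothetical helper: write the codepoint as "U+XXXX" into "out".
// Returns 0 on success and -1 when the codepoint is out of range or does not fit.
static int codepoint_format(uint32_t codepoint, char *out, size_t size) {

  // A pointer style check such as "if (!codepoint) return -1;" would wrongly reject U+0000.
  if (codepoint > 0x10ffff) return -1;

  const int written = snprintf(out, size, "U+%04lX", (unsigned long) codepoint);

  return (written < 0 || (size_t) written >= size) ? -1 : 0;
}
```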
Kevin Day [Tue, 10 May 2022 01:55:52 +0000 (20:55 -0500)]
Update: Remove unused code, cast (char) to (unsigned int) for array indexes, and fix bitwise problem.
A bit of stale code is exposed by running the compiler with -Wall.
Example:
fake clean build -d -Wall
fake clean build -d -Wall -m clang
Using char (generally) is fine because the numbers match.
However, there tends to be specific cases and behaviors that might result in char being not treated as expected.
Explicitly cast to an (unsigned int) to play it safe.
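A minimal sketch of a safe array index follows. The function name is hypothetical. The intermediate (unsigned char) cast is an addition made for this sketch and is not stated above: on platforms where char is signed, casting a negative char directly to (unsigned int) sign-extends it into a very large index.

```c
#include <stddef.h>

// Hypothetical example: count how often each byte value occurs in a string.
static void count_bytes(const char *string, size_t length, size_t counts[256]) {
  for (size_t i = 0; i < 256; ++i) {
    counts[i] = 0;
  }

  for (size_t i = 0; i < length; ++i) {

    // Going through (unsigned char) keeps the index within 0 to 255 even when char is signed.
    const unsigned int index = (unsigned int) (unsigned char) string[i];

    ++counts[index];
  }
}
```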
The fwrite_unlocked() response checks were previously mass refactored to use a size check on the response.
Mistakes in this resulted in the not operation "!" being left in place, producing a bad if condition check.
Clang warns about missing parentheses when "&&" and "||" are used together.
I would argue that this is simply an ignorance or incompetence in the programmers.
The programmers should be expected to understand basic parts of a language, such as order of operations.
Rather than fight this battle, I am just adding parentheses.
Kevin Day [Thu, 5 May 2022 05:03:04 +0000 (00:03 -0500)]
Cleanup: Fix typo in the word 'whitespace' and break the word 'whitespace' into two words.
I noticed a typo 'whitspace'.
Add the missing 'e'.
When printing to the user or documenting in comments, use the standard two word form 'white space'.
The programming specific variant of that as a single word will remain in use for programs.
Kevin Day [Thu, 5 May 2022 04:09:27 +0000 (23:09 -0500)]
Cleanup: More confusing messages due to a bad refactor.
At this point it has become clear that there was a refactor in the past that incorrectly replaced some of the words with "file".
This made nonsense messages.
These are to be fixed as I notice them.
Kevin Day [Thu, 5 May 2022 03:26:39 +0000 (22:26 -0500)]
Update: Have iki_write use form-feed rather than end of line character for pipe input.
The IKI specification allows for just about any character inside the content, including newlines.
Given that newlines are far more common than form-feed characters, switch to form-feed.
The form-feed character is chosen because there is a standard escape sequence that can easily be passed to commands like echo.
For example: echo -en "a\fb" | iki_write
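Splitting pipe input on the form-feed character can be sketched as follows. The function names are hypothetical and how iki_write interprets each record is not shown; the point is only that newlines stay inside a record while '\f' ends it.

```c
#include <stddef.h>

// Return the index of the next form-feed at or after "start", or "length" if there is none.
static size_t record_end(const char *buffer, size_t length, size_t start) {
  size_t i = start;

  while (i < length && buffer[i] != '\f') {
    ++i;
  }

  return i;
}

// Count the form-feed separated records in the buffer.
static size_t record_count(const char *buffer, size_t length) {
  size_t count = 0;

  for (size_t start = 0; start < length; start = record_end(buffer, length, start) + 1) {
    ++count;
  }

  return count;
}
```

The input "a\fb" from the echo example contains two records, and an input such as "line 1\nline 2\fb" also contains two because the newline does not separate records.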
Kevin Day [Thu, 5 May 2022 03:08:12 +0000 (22:08 -0500)]
Regression: The iki_read is not processing anything.
This is a mistake in the commit b1dddea0ecf4aecfe0c7965b1b40b2432ce47b8a.
The size_file variable was created but file.size_read was not replaced in the call:
status = f_file_size_by_id(file.id, &file.size_read);
Kevin Day [Wed, 4 May 2022 02:16:39 +0000 (21:16 -0500)]
Cleanup: Rename 'binary' to 'bytecode' in UTF8 program.
The use of the term "binary" here is both valid and invalid.
The UTF-8 is considered text and so this is better called text.
Another name for this is "bytecode".
Given that these both have "b" (for partially preserving the parameters) and "bytecode" is a bit more specific than text, use "bytecode".
Kevin Day [Mon, 2 May 2022 01:24:30 +0000 (20:24 -0500)]
Update: Add licenses to text files and clarify OSL text.
Minor modifications to the text in the open-standard-license-1.0.
One notable change is changing "application" to "applying" because "application" can be misinterpreted as a program (programs are also called applications, or apps for short).
In this context the word "application" is meant to mean "applying".
Just change the word to "applying" to avoid this potential confusion.
Add more text to the protocol terminology to declare the context is in regards to computers and source code.
Add the cc-by-sa-4.0 license file.
They do not provide a downloadable copy of the license, so for now just add links.
I need to come back and fix this once I get a downloadable text file that I can legally store in the source code and transfer to others.
Kevin Day [Thu, 28 Apr 2022 03:05:15 +0000 (22:05 -0500)]
Cleanup: Controller program return codes should be more generalized.
It turns out that when agetty returns on access denied while trying to login, it returns access denied to the controller program.
The controller program has no way of distinguishing access denied while trying to execute the program from access denied because the program itself returned access denied.
Change the error messages to be more generalized so that they are less misleading.
Kevin Day [Wed, 27 Apr 2022 05:08:32 +0000 (00:08 -0500)]
Bugfix: When compiled as "init" the controller program does not use the correct paths.
The isolation between the "init" specific changes and the normal "controller" specific code is insufficient.
Move all of the special paths into the main program, introducing a new header and source file called "main-common.h" and "main-common.c".
The main program is now responsible for providing these strings.
Kevin Day [Sat, 23 Apr 2022 06:11:37 +0000 (01:11 -0500)]
Update: Implement "github" test system in the testing script.
The github actions has a repository that lacks cmocka.
Using apt-get to download the system's cmocka library is slow and a waste of time.
Utilize the support for a custom "github" test system and manually download, compile, and install the cmocka source.
Given that this is for github, utilize a cmocka mirror repository that I found on github.
This is not ideal because it pulls from master rather than a specific version but it should work well enough.
Make any other appropriate changes or improvements to the testing script.