Kevin Day [Thu, 6 May 2021 01:09:43 +0000 (20:09 -0500)]
Bugfix: FSS Basic Read and FSS Basic List Read problems and cleanups.
The delimit is not being calculated correctly.
The fss_basic_read_load() and fss_basic_list_read_load() functions are out of place.
The parameter order for some functions, like fss_basic_read_print_at(), is not adhering to the ordering practices (constants on the left).
The total is not consistently being counted.
The FSS Basic Read is not taking into consideration when Content is empty and --object is or is not selected for some line specific processing.
When there is only --content, then whether or not Content is empty matters.
When there is --object (or both --object and --content), then whether or not Content is empty does not matter because Object is already taking up a given line.
Kevin Day [Wed, 5 May 2021 05:24:12 +0000 (00:24 -0500)]
Cleanup: Use number instead of word in FSS Basic Read help.
While committing the FSS Basic List changes to be in sync with this, I noticed that I had "..start at 0 instead of one..".
This is inconsistent.
I either need to use both words ("zero" and "one") or both numbers ("0" and "1").
I opted to use the numbers.
Kevin Day [Wed, 5 May 2021 05:13:35 +0000 (00:13 -0500)]
Update: Improvements and tweaks in FSS Basic Read.
Add additional help information.
Cleanup comments.
In some cases the total printing is inverted by accident.
The print_object function pointer doesn't really need to exist anymore.
Add missing print for when both total and line parameters are specified.
Kevin Day [Wed, 5 May 2021 05:00:40 +0000 (00:00 -0500)]
Bugfix: UTF-8 characters buffer is incorrectly returning an error.
The previous commit: "Bugfix: UTF-8 characters are not being fully printed" exposed that for UTF-8 characters (width 2 or greater), an error is always returned.
When the width properly fits in the requested range, return the appropriate success code instead of an error.
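A minimal sketch of the kind of range check described above, assuming an inclusive byte range. The status values and both function names are hypothetical and are not the FLL API.

```c
#include <stddef.h>

// Hypothetical status values (assumptions, not the real FLL codes).
enum { example_status_none = 0, example_status_failure = 1 };

// Width in bytes of a UTF-8 sequence, determined from its first byte.
// Returns 0 for a continuation byte or an invalid lead byte.
size_t example_utf8_width(const unsigned char first) {
  if (first < 0x80) return 1;
  if ((first & 0xe0) == 0xc0) return 2;
  if ((first & 0xf0) == 0xe0) return 3;
  if ((first & 0xf8) == 0xf0) return 4;

  return 0;
}

// Check whether the character at "start" fits in the inclusive range [start, stop].
int example_utf8_fits(const char * const buffer, const size_t start, const size_t stop) {
  const size_t width = example_utf8_width((unsigned char) buffer[start]);

  if (!width || start > stop || width > (stop - start) + 1) {
    return example_status_failure;
  }

  // The width fits, so this is a success even when the width is 2 or greater.
  return example_status_none;
}
```

For a two-byte character, a range covering both bytes is treated as success, while a range covering only the first byte is treated as failure.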
The previous implementation is weak in that there is no good way to just delimit Object or just delimit Content.
Redesign to allow for specifying the delimit parameter multiple times and therefore allow for customizing what to specify.
Rename "depth" to "content" in the delimit enum to better communicate that this is for "content" delimiting.
Examples:
- "fss_basic_read --delimit object": Results in delimited Objects but not Content.
- "fss_basic_read --delimit 0+": Results in delimited Content (position 0 and greater) but not Objects.
- "fss_basic_read --delimit object --delimit 1-": Results in delimited Objects and delimit Content (position 1 or less).
For this standard, there is no delimit support in Content so the use of the numeric range is superfluous.
Having this functionality, however, makes it consistent with the rest of the FSS Read programs.
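The three example values above could be parsed along these lines. This is only a sketch: the type names, the function name, and the handling of a plain number without "+" or "-" are assumptions and are not taken from the fss_basic_read source.

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical modes for a single --delimit value.
typedef enum {
  example_delimit_invalid = 0,
  example_delimit_object,          // "object"
  example_delimit_content,         // "N" (assumption: not shown in the examples)
  example_delimit_content_greater, // "N+": Content at position N and greater
  example_delimit_content_lesser   // "N-": Content at position N or less
} example_delimit_mode_t;

typedef struct {
  example_delimit_mode_t mode;
  unsigned long position;
} example_delimit_t;

// Parse one --delimit value into a mode and a position.
example_delimit_t example_delimit_parse(const char * const value) {
  example_delimit_t result = { example_delimit_invalid, 0 };

  if (!strcmp(value, "object")) {
    result.mode = example_delimit_object;

    return result;
  }

  // A position must begin with a digit (strtoul() alone accepts signs and spaces).
  if (value[0] < '0' || value[0] > '9') return result;

  char *end = NULL;

  result.position = strtoul(value, &end, 10);

  if (!*end) result.mode = example_delimit_content;
  else if (!strcmp(end, "+")) result.mode = example_delimit_content_greater;
  else if (!strcmp(end, "-")) result.mode = example_delimit_content_lesser;

  return result;
}
```

Because the parameter may be given multiple times, each value would be parsed separately and the results combined by the caller.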
Kevin Day [Mon, 3 May 2021 02:51:54 +0000 (21:51 -0500)]
Regression: FSS Basic read --select is always returning empty string or 0.
After changing the code structure, the check to see if the select number is non-zero was lost.
As a result the code is always operating as if the select number is non-zero.
When the select number is zero, all existing operations should continue.
I seem to have forgotten to wrap these macro checks in parenthesis.
As a result something like "!macro_f_file_type_is_block()" would expand to "!macro_f_file_type_get(mode) == f_file_type_block".
What it should expand to should be logically equivalent to "macro_f_file_type_get(mode) != f_file_type_block".
The expansion with the parenthesis would be: "!(macro_f_file_type_get(mode) == f_file_type_block)" and that is indeed logically equivalent.
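The precedence problem can be reproduced with a small sketch. The macro names, the mask, and the type value below are hypothetical stand-ins chosen for illustration. They are not the real macro_f_file_type_*() definitions.

```c
#include <stdbool.h>

// Hypothetical stand-ins for the file type macros (assumed values).
#define example_type_mask  0xf000u
#define example_type_block 0x6000u
#define example_type_get(mode) ((mode) & example_type_mask)

// Without outer parentheses: a leading "!" binds only to the left operand.
#define example_is_block_unwrapped(mode) example_type_get(mode) == example_type_block

// With outer parentheses: a leading "!" applies to the whole comparison.
#define example_is_block_wrapped(mode) (example_type_get(mode) == example_type_block)

// Expands to "!((mode) & mask) == block", which is not the intended test.
bool example_not_block_unwrapped(const unsigned int mode) {
  return !example_is_block_unwrapped(mode);
}

// Expands to "!(((mode) & mask) == block)", which is the intended test.
bool example_not_block_wrapped(const unsigned int mode) {
  return !example_is_block_wrapped(mode);
}
```

For a mode of 0x4000u, which is not the assumed block type, the wrapped version returns true while the unwrapped version returns false. Compilers may warn about the unwrapped form, for example through -Wlogical-not-parentheses.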
Kevin Day [Mon, 3 May 2021 02:10:40 +0000 (21:10 -0500)]
Cleanup: Disable parenthesis warning in GCC.
This is another case where the compiler is overstepping itself.
The programmer should understand the language and the order of operations.
Disable the warning by passing -Wno-parentheses.
(The warning only appears if -Wall is given, but if -Wall is passed then -Wno-parentheses should be already in place.)
Kevin Day [Mon, 3 May 2021 02:07:21 +0000 (21:07 -0500)]
Cleanup: Sloppy use of "main" inside of "main()", oops.
I cannot believe I let this one slip through (and so did the compilers).
When I refactored "data" to be "main" this included the refactor in the function called "main".
This is dangerous at worst and at best bad practice.
Given that "data" is no longer to be used in the main(), just rename the uses of "main" back to "data" for the variable name only (not the typedef structure name).
Kevin Day [Mon, 3 May 2021 01:53:48 +0000 (20:53 -0500)]
Cleanup: FSS Basic Read parameter processing, file variable related, and some ++/--.
Simplify the parameter processing using an array to avoid repeating similar code.
Relocate the file variable so that it goes out of scope and is removed from the stack before processing.
The file variable is no longer needed during processing with the current design so don't hold it in memory after it is no longer needed.
Relocate the file stream close so that it doesn't need to be specified as many times in the code.
There are some ++/-- postfixes in use that would be better as prefixes (such as changing i++ to ++i).
Kevin Day [Sun, 2 May 2021 22:09:43 +0000 (17:09 -0500)]
Update: Implement data structure in FSS Read.
This is the designated follow up commit for resolving the need for a "data" structure.
The parameters are now extracted into a bitwise "option" property on the "data" structure.
The process is now broken up into multiple functions.
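A bitwise "option" property of this kind might look like the following sketch. The flag names, their values, and the structure layout are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical option flags; the real names and values are not shown in this commit.
enum {
  example_option_content = 0x1,
  example_option_object  = 0x2,
  example_option_total   = 0x4,
  example_option_line    = 0x8
};

// Hypothetical "data" structure holding the extracted parameters.
typedef struct {
  uint16_t option;
} example_data_t;

// Record that a parameter was given.
void example_option_set(example_data_t * const data, const uint16_t flag) {
  data->option |= flag;
}

// Test whether a parameter was given.
bool example_option_has(const example_data_t data, const uint16_t flag) {
  return (data.option & flag) != 0;
}
```

Each of the smaller functions can then test the flags it cares about instead of re-reading the console parameters.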
Kevin Day [Sun, 2 May 2021 05:12:42 +0000 (00:12 -0500)]
Update: Remove the "amount" from file stream functions.
The "amount" is present to support the parameters that fread() and fwrite() utilize.
This makes no sense to me and it is annoying and confusing.
I end up having to just put 1.
Get rid of it and just use the file.size_read and file.size_write to specify the buffer size to read/write.
The only things that I can think of might be atomic operations, locking, and calling the function multiple times.
These are good reasons to have an "amount".
If I end up wanting or needing an "amount", I may add additional functions later on.
Kevin Day [Sat, 1 May 2021 22:31:49 +0000 (17:31 -0500)]
Cleanup: Always have private-common.h and private-common.c for programs.
I am now introducing a new standard practice of always having a private-common.h and private-common.c for programs.
The private data types shared across the program will be stored in these.
These will also provide any functions for allocating, deallocating, or otherwise managing those private structures.
This makes no effort to move over or implement any of the allocation/deallocation functions.
Individual programs will be updated on an as able basis to address this.
The Controller program already has this but it has a bit more than this practice.
The Controller program will see some structural cleanup in the future.
Kevin Day [Sat, 1 May 2021 22:00:27 +0000 (17:00 -0500)]
Refactor: Relocate 'macro' prefix in names for macros.
Placing 'macro' after the project name, such as 'fll_' or 'fake_', is a good idea for keeping the function names consistent and contained within the project naming structure.
For short names like 'f_' and 'f_string_t' this is not a problem.
For complex and usually longer names, such as 'fake_' and 'fake_main_t', this becomes confusing quickly.
I have decided to favor the less consistent macro as a prefix to the project name, to make the code a bit more readable.
For example: 'fake_main_macro_delete' would become 'macro_fake_main_delete'.
Kevin Day [Sat, 1 May 2021 21:46:35 +0000 (16:46 -0500)]
Refactor: Use 'main' instead of 'data'.
The original goal of 'data' is to be used as the main store of data for the program.
A program using the programs as a library is also expected to get and use this structure.
The problem is that the programs are designed such that the caller to the program (as a library) should not have access to internal details.
Refactor 'data' to 'main'.
This is a more precise name in that it is the structure passed as if it were called from 'main(argc, argv)'.
This also frees up 'data' for internal use such that 'data' can now be the more generalized 'data' without exposing anything to 'main'.
The Controller program is already using 'main', so that 'main' was refactored to 'global'.
There are still more changes to do, such as restructuring the 'main' types to ensure nothing unwanted is exposed to a caller.
These additional changes, however, are beyond the scope of this commit.
Kevin Day [Sat, 1 May 2021 19:46:21 +0000 (14:46 -0500)]
Bugfix: Several of the parameters are not handling the desired set of possible combinations.
With recent changes, more parameters may be used together than before.
This exposed how several combinations simply did nothing or did not do what was expected.
Redesign and even simplify the code to allow these parameters to work together.
Some of the code is abstracted out into their own functions.
There is a goal to have a data structure for passing setting, but before that is done I want to make other significant changes FLL-wide.
For this reason, I am putting in place a temporary 'print_this' bitwise variable.
Kevin Day [Sat, 1 May 2021 16:23:56 +0000 (11:23 -0500)]
Update: Redesign Basic List loading logic to load all files into a single buffer.
Upon further use and review I believe that it is better to treat all input sources as a single buffer.
This allows for all of the parameters to work closer to what feels like normal logic.
If I want to get the total lines for all listed files, then I should get that.
If I want to get the total lines for each listed file, then I can call this program once for each file to get that.
I am working on Basic List first but this will be repeated for all of the other FSS read projects as well (likely in a single commit).
One of the downsides of this is that it exposes a current design limitation where the max buffer size is more likely to be reached.
Future work will most likely address this in some manner.
Kevin Day [Sat, 1 May 2021 03:46:02 +0000 (22:46 -0500)]
Regression: Incorrect char type resulted in SIGPIPE.
The uint8_t/int8_t was changed into char recently.
This change appears to have been incomplete for the Byte Dump program.
Update the code to be aware of PIPE by passing a NULL string to represent a PIPE instead of a file.
While fixing this, go ahead and replace read() with fgetc().
This is more efficient due to the use of a file stream.
The use of read() is originally done for testing some of the lower-level FLL design.
This testing is no longer necessary so it is worth switching to fgetc().
Future design may merit reading larger chunks than 1 character at a time.
The use of fseek() is now available and in use (for non-PIPEs).
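A minimal sketch of reading through a file stream with fgetc(), where a NULL name stands for the pipe. Mapping the pipe to stdin and both function names are assumptions and not the Byte Dump code.

```c
#include <stdio.h>

// Open the input: a NULL name represents the pipe (assumed here to be stdin).
FILE *example_input_open(const char * const name) {
  return name ? fopen(name, "rb") : stdin;
}

// Count bytes one character at a time through the buffered stream.
// Returns -1 if the stream is NULL or a read error occurs.
long example_count_bytes(FILE * const stream) {
  if (!stream) return -1;

  long total = 0;
  int c; // Must be an int so that EOF is distinguishable from a valid byte.

  while ((c = fgetc(stream)) != EOF) {
    ++total;
  }

  return ferror(stream) ? -1 : total;
}
```

The stream is buffered by the C library, so reading one character at a time does not mean one system call per character, unlike calling read() for a single byte.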
Kevin Day [Wed, 28 Apr 2021 23:57:36 +0000 (18:57 -0500)]
Update: Replace static strings with extern defined in common file.
This makes the controller program more consistent with the FLL projects.
This should make it easier to use the controller program as a library as well.
Kevin Day [Wed, 28 Apr 2021 23:08:29 +0000 (18:08 -0500)]
Update: Fix problems and make changes after testing LLVM's Clang compiler.
Add flags "-Wno-logical-not-parentheses" "-Wno-logical-op-parentheses".
Programmers should be expected to understand the language they are working in.
Having the compiler enforce a style is bad practice.
The cap_to_text() needs ssize_t, do not use f_array_length_t.
The project is inconsistently using int8_t and uint8_t for character types.
Furthermore, clang likes to complain about uint8_t being converted to char.
(I believe uint8_t is supposed to be of type unsigned char and char by default is supposed to be unsigned.)
Fix the inconsistency and just use char, which happens to make clang happy without any complaints from gcc.
Clang does a much better job at detecting some problems than GCC.
Compiling with clang resulted in revealing several printf related problems that now should be fixed.
The f_gcc_attribute_visibility_internal (and related) have the "_gcc" removed because these seem to exist beyond just gcc (such as with clang).
Make sure something is always returned.
There are some functions that didn't have a return.
One of the UTF processing functions has an accidental hex character in the condition.
Remove the extra character, which I am pretty sure is the leading "d".
I have not validated the correct sequence and so further investigation into the proper sequence for U+1D7CE to U+1D7D7 may be warranted.
The clang compiler claims that int main() should only be an integer for the argc.
This is unfortunate but that is fine, switch to use an int instead of an unsigned long.
Kevin Day [Tue, 27 Apr 2021 23:49:35 +0000 (18:49 -0500)]
Bugfix: PID and PID file should account for multiples during process execution.
I forgot all about needing to do this so I am considering this a bug.
Each process may execute multiple Actions.
Each Action has its own PID.
In the case of foreground (synchronous) execution, having only a single PID and PID file path on the Process structure is not a problem.
When execution operates with a PID file (asynchronous), multiple PIDs (and respective PID files) may exist for any single Process structure.
Kevin Day [Tue, 27 Apr 2021 04:09:09 +0000 (23:09 -0500)]
Regression: Entry error after failure or during validation is not propagating.
When an Entry fails and successfully executes a failsafe, the failure is not propagated.
The failsafe is meant to bail out and not continue onward, so after a successful or failed failsafe, return F_failure (with error bit as appropriate).
When passing --validate without --test, the program is not exiting as it should.
This is because the state is not being handled just like the status is not being handled.
When joining threads, be sure to reset the identifiers.
Remove now extra Exit processing block.
This is now handled fully by the cancellation function.
Restore the thread enabled state after operating failsafe.
Get rid of simulate variable, instead use the console parameter directly.
This saves memory by a trivial amount.
Kevin Day [Mon, 26 Apr 2021 23:45:23 +0000 (18:45 -0500)]
Update: Implement "execute" support and fix bugs.
Provide new feature for executing into another program.
This is provided via the new Item Action "execute".
The function controller_perform_ready() is being called and the status is being checked but the status is not being assigned.
There are a few cases where thread.enabled needs to be checked and a few cases it does not need to be checked.
Several of the threads need to be aware of the normal/other status to properly determine the thread.enabled situation.
Without this, they make incorrect decisions that result in bugs.
I did not want to implement a new structure to resolve this so instead provide custom wrapper functions to call that set the appropriate normal/other state.
The function controller_thread_process_cancel() now needs to be caller aware so that the caller does not get cancelled.
Kevin Day [Sun, 25 Apr 2021 17:13:54 +0000 (12:13 -0500)]
Update: Improve entry/exit verbose messages, display exit with -tv, and wait all with read lock fix.
The verbose messages should be distinguishing between an entry and an exit now.
When both --simulate and --test are specified, the printout should include both the entry and the exit.
Previously, only the entry was printed.
The wait all should only be triggered to wait for all processes at the current moment in time.
Furthermore, the read lock is being held too long (for the entire loop).
Maintain the read lock long enough to build a list of all processes to wait for.
Kevin Day [Fri, 23 Apr 2021 01:49:21 +0000 (20:49 -0500)]
Progress: redesign processing logic to accommodate different process Rule Actions and update dependency design accordingly.
When I implemented the "exit" support (opposite of an "entry") I noticed an oversight in the design where there was no way to distinguish between a process that successfully started via the "entry" or the "exit".
Change the design to now utilize unique Rule Processes for each Rule Action requested.
This required further changes to the status handling.
A rule status is now an array of all possible Rule Actions.
This is utilized using a static array for simplicity purposes (there is no need for a dynamic array here).
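One way to picture a status array indexed by Rule Action is sketched below. Only three actions are listed, and every name and value is a hypothetical placeholder.

```c
// Hypothetical subset of Rule Actions; the last entry doubles as the array size.
enum {
  example_action_start = 0,
  example_action_stop,
  example_action_restart,
  example_action_total
};

// Hypothetical status values.
enum {
  example_status_unknown = 0,
  example_status_success,
  example_status_failure
};

// A rule with one status slot per Rule Action, stored in a static array.
typedef struct {
  int status[example_action_total];
} example_rule_t;

// Set the status for a single Rule Action, ignoring out of range actions.
void example_rule_status_set(example_rule_t * const rule, const int action, const int status) {
  if (action >= 0 && action < example_action_total) {
    rule->status[action] = status;
  }
}

// Get the status for a single Rule Action.
int example_rule_status_get(const example_rule_t * const rule, const int action) {
  if (action < 0 || action >= example_action_total) return example_status_unknown;

  return rule->status[action];
}
```

With this layout a successful "start" does not overwrite or imply anything about the status of "stop".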
All of the recent changes introduced a lot more complex code.
There are now helper functions to help facilitate common tasks.
This should also make updating easier as there is only one place to update.
The downside is the introduction of an additional function call (which is a tiny runtime cost).
To facilitate this new design, the Rule files must also be aware of the different Rule Actions.
The "need", "want", and "wish" have been relocated into a new Rule Action called "on".
Additional parameters for an "on" allow for describing the Rule Action to which the dependency applies.
This allows, for example, a "stop" Action to operate in a different order than a "start" Action.
An example of this is provided.
Look at the data/settings/example/rules/serial/*.rule files.
An example syntax is:
on start need serial s_1
on stop need serial s_3
When validate is passed, do not wait for asynchronous processes because they are not run.
Normally this is not noticeable but is exposed when 'script/fail' failed to execute and return ('script/fail' should execute, fail, and return).
This bug is caused by the wait functions not checking to see if the caller is the same as the current process (it was waiting for itself).
This may have been introduced as a result of the redesign.
Kevin Day [Wed, 21 Apr 2021 22:12:00 +0000 (17:12 -0500)]
Progress: Redesign entry/exit rule handling, now requiring Action instead of "rule".
The Entry and Exit files are using "rule" to designate a Rule to operate.
This is designed on the assumption that an Entry always runs a "start" and Exit always runs a "stop".
This behavior is changed such that "rule" is no longer specified and one of the 9 supported Actions may be used.
Such as "start" or "stop", for example.
There is still more work to do as this change doesn't fix the Exit in terms of dependency handling.
Currently, the process structure does not distinguish the Rule action, such as "start" or "stop".
Additional changes to the process structure are needed.
Kevin Day [Tue, 20 Apr 2021 23:16:45 +0000 (18:16 -0500)]
Progress: controller program, add exit support.
This implements the initial work needed to get exit files working.
A new thread is added to handle both "entry" and "exit" to free up the "rule" thread and allow for the "exit" to start while the "rule" thread exists.
The thread enabled process is now more complex given that there needs to be stages so that "exit" threads can work while "entry" or "control" operations begin terminating.
Add several helper functions to help simplify the detection.
Add an "alert" lock to send/receive alerts.
This is a more general purpose lock that tells something waiting on the alert to wake up and see if anything you care about changed.
The file loading now needs to be aware of "optional" files, namely the exit files.
If the named exit file does not exist, then do not care as it is not required (unlike entry and rule files).
Many of the read and write locks are now timed so that they can check to see if the controller is enabled or not.
While working on this I noticed some oversights and things that I did not previously consider.
I need to follow this up with a redesign in the entry/exit rule handling.
Instead of "rule", I should have the rule actions like "start" and "stop".
There needs to be support for default behaviors, such as allow for "stop" to not have a parameter and to correctly find and terminate a service.
There needs to be consideration on dependencies when exiting, perhaps there needs to be an exit-specific dependency management.
Kevin Day [Sun, 18 Apr 2021 15:47:12 +0000 (10:47 -0500)]
Progress: exit prep work, reserve and implement "setting" in entries and exits, add option to "ready", and update documentation.
Add documentation, specifications, and basic structural changes for the "exit" files.
The "entry" and "exit" files now reserve "setting" for designating settings.
Currently, only "entry" supports a setting and that setting is "mode".
The "mode" setting designates how the entry program is intended to behave.
When operating as a "service", the controller program will wait indefinitely for commands (via the not yet implemented "control" program or "control" socket).
When operating as a "program", the controller program will immediately exit after completion.
The "ready" Entry Action now supports "wait".
When "wait" is provided, the "ready" operation will wait for all current asynchronous processes to complete before operating.
Update documentation and specifications, adding the "exit" files.
Cleanup the existing documentation and specifications, fixing wording.
Kevin Day [Sat, 17 Apr 2021 23:44:52 +0000 (18:44 -0500)]
Update: implement with pid execution, simplify related rules.
Implement the with pid execution.
This expects the process to be spawned in the background.
After some review, I decided to remove "use" and "create", replacing those with "pid_file".
The reasons are:
- For "use", the spawned service manages the pid file, so it would be overly complicated to try and manage it in addition to the spawned service.
- For "create", if the process is to go into the background, in order to manage it then there would still need to be a running process (this defeats the purpose).
When the termination signal is received, then inform any background process spawned by the controller program to exit, based on the existence of the pid file.
Kevin Day [Sat, 17 Apr 2021 19:44:21 +0000 (14:44 -0500)]
Update: fix file stream error return, improve file error messages, add new status types and remove status types.
The f_file_stream_close() function is missing some errors.
The file error message printer is missing several error messages.
There is also some cleanup needed in the file error messages (consistency problems, mostly).
Add F_file_overflow, and F_file_underflow for file specific overflow and underflow.
Add missing F_file_descriptor_not.
Remove F_file_allocation and F_file_deallocation, only the generalized F_memory and F_memory_not are used now.
Kevin Day [Sat, 17 Apr 2021 16:36:17 +0000 (11:36 -0500)]
Cleanup: relocate fl_color code to f_color and remove fl_color.
Ever since f_string became a core/required/special project where all level_0 could depend on it, fl_color no longer needed to be at level_1.
Relocate fl_color into f_color, removing the fl_color project entirely.
Update all dependencies.
This exposed some missing dependencies in fll_program that fl_color is secretly handling.
Fix that as well.
Kevin Day [Sat, 17 Apr 2021 05:41:11 +0000 (00:41 -0500)]
Feature: controller program must support "with full_path".
This is necessary to accommodate fickle programs like SSHD where the full path in argument[0] is required.
This implements a general feature called "with" which is provided to add flags on a per Rule Type basis.
These flags will tweak the Rule Type in some manner.
Only one flag is supported at this time: "full_path".
The "full_path" provides the necessary functionality to make SSHD happy.
Kevin Day [Sat, 17 Apr 2021 05:34:34 +0000 (00:34 -0500)]
Update: redesign fl_execute_parameter_option_path in fll_execute.
The previous design of fl_execute_parameter_option_path seemed pointless because it could be detected if a slash is in the program name.
Recent changes have utilized the slash in the path to do just that.
While working with the controller program the SSHD program revealed that some programs are fickle about what is in their argument[0].
SSHD wants the full path and as such it needs to be provided.
This makes sense as most programs, when started with a full path, generally expect a full path in argument[0].
The fl_execute_parameter_option_path is redesigned to instead provide a full path for program and for argument[0].
Kevin Day [Sat, 17 Apr 2021 04:00:00 +0000 (23:00 -0500)]
Bugfix: fll_execute does not execute full path.
When passed a full path to a program (rather than depending on detecting it from just a program name) the program does not execute.
There are several logic flaws and mistakes.
- The last_slash should have "+1" to avoid including the slash itself.
- When environment is cleared, it needs to potentially use "program" or "arguments.array[0].string" if a slash already exists in the provided name.
- The final NULL at the end of the program_path string is missing.
- The "program" or "arguments.array[0].string" should be used in general instead of always the "program" only.
- The fixated_is is being used incorrectly in private_fll_execute_path_arguments_fixate (and the documentation is incorrect).
I suspect that these functions are messy due to changes in design that were not fully updated (including the appropriate documentation).
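The "+1" item in the list above can be illustrated with a short sketch that extracts the program name from a path. The function name is hypothetical and the sketch uses a plain NULL terminated string rather than the FLL string types.

```c
#include <string.h>

// Return the program name portion of a path: the text after the last slash.
// When there is no slash, the whole string is already the name.
const char *example_program_name(const char * const path) {
  const char * const last_slash = strrchr(path, '/');

  // The "+ 1" skips the slash itself so that it is not part of the name.
  return last_slash ? last_slash + 1 : path;
}
```

For the path "/usr/sbin/sshd" this yields "sshd". Without the "+ 1" it would yield "/sshd".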
Kevin Day [Fri, 16 Apr 2021 23:48:04 +0000 (18:48 -0500)]
Bugfix: properly configure example.
The reason why controller is attempting to execute "/var/run/sshd.pid" is not because it is incorrectly attempting to process the "use" as a start.
Instead, it is because it is properly attempting to do so because it is being told to.
Oops! This is just a simple misconfiguration and not a bug in the code.
A bug in the configuration.
Kevin Day [Fri, 16 Apr 2021 04:20:22 +0000 (23:20 -0500)]
Progress: controller program, begin working on pid file related executions.
It occurred to me that I could quickly create a process with a PID file using a script.
I then realized that "service" only applies to binaries and not scripts.
Add a new type called "utility" that is identical to "service" in purpose except that it processes scripts.
Begin implementing the PID file related code.
It seems there are a few things to address, such as proper test output display.
For example, I am seeing "Simulating execution of '/var/run/sshd.pid' with the arguments: '' from 'SSH Service'.".
That "/var/run/sshd.pid" should instead be "/usr/sbin/sshd".
Creating or checking existence of the PID files is not yet written.
The behavior after a process forks is not yet written either.
An error needs to be printed and F_failure (with error bit set) should be returned.
The "success = F_failure" should probably have error bit set.
Add new rules and entries for testing this (to be implemented) functionality.
Update the documentation.
Some of the documentation is outdated and as a result, wrong.
Kevin Day [Thu, 15 Apr 2021 02:52:47 +0000 (21:52 -0500)]
Update: implement read lock handling support.
There are some cases where I am able to figure out a way to make the logic continue on failure and handle the case of read lock failure.
There are other cases where I am not yet sure how to handle.
Expect more changes in the future to address ability to continue onward on lock failure.
Some functions now return F_lock (with error bit) to designate that the lock failed.
In these cases, it is for functions that require the caller to have a read lock (such as process->lock) locked before starting.
When such functions return the caller now has a way of knowing that the read lock is no longer held.
Kevin Day [Wed, 14 Apr 2021 22:44:13 +0000 (17:44 -0500)]
Update: handle write lock failures, begin adding support for handling read lock failures.
Get the response on write lock failure.
Present an error message.
Return the error status.
Increase the cancellation timeouts from 0.06 seconds to 0.6 seconds, making it less aggressive.
This results in a 90 second max timeout, which gives more problematic exits a lot more time to cleanly exit.
Begin adding support for getting and handling read lock failures.
Read lock attempts will be in a loop that checks main thread enabled as well.
The timeout is longer than write locks to reduce the CPU overhead as there will be a lot of read locks.
Follow up work will utilize this read lock status handling.
Remove the additional thread enabled check that follows a write lock.
The write lock already checks if the main thread is enabled.
Kevin Day [Wed, 14 Apr 2021 22:41:18 +0000 (17:41 -0500)]
Update: f_thread_lock_read_timed() should return F_resource_not.
The pthread_rwlock_timedrdlock() potentially returns EAGAIN.
This needs to be handled, translated into F_resource_not (with error bit), and then returned.
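A sketch of the translation step, applied to the error number that pthread_rwlock_timedrdlock() returns. The status names, their values, the error bit, and the treatment of a timeout as a non-error are all assumptions made for illustration.

```c
#include <errno.h>

// Hypothetical status codes and error bit (not the real FLL values).
enum {
  example_status_none         = 0,
  example_status_time         = 1,
  example_status_resource_not = 2,
  example_status_failure      = 3,
  example_status_bit_error    = 0x8000
};

// Translate the error number returned by a timed read lock call into a status.
int example_lock_read_timed_status(const int error) {
  if (!error) return example_status_none;

  // Assumption: a timeout is reported without the error bit.
  if (error == ETIMEDOUT) return example_status_time;

  // EAGAIN: the maximum number of read locks would be exceeded.
  if (error == EAGAIN) return example_status_resource_not | example_status_bit_error;

  return example_status_failure | example_status_bit_error;
}
```

Note that pthread_rwlock_timedrdlock() returns the error number directly rather than setting errno, so the return value itself is what gets translated.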
Kevin Day [Wed, 14 Apr 2021 17:08:27 +0000 (12:08 -0500)]
Update: handle asynchronous failures for failsafe, update locks, and related fixes.
The failsafe needs to be triggered when a required but asynchronous process fails.
I originally planned on implementing this via locks but I would rather avoid adding even more locks.
This approach instead provides a wait loop at the end of the entry waiting only on all required processes.
If any of these fail, then the wait will return the requirement failure.
This is a change to the wait all function and behavior, which is updated to now return statuses.
Get rid of process->status, it is no longer needed now that process->rule.status exists.
Having it remain is wasteful and confusing.
The entry processing is updated to be failsafe aware.
This now potentially operates the failsafe rule item.
The failsafe must not do a wait all like normal operation inside of the entry processing function.
There are some discrepancies between "process options" and "rule options".
Technically, there are no "options" on a rule as this is a concept introduced for/by entries.
The process uses these options but most of them were named "rule options".
I then later created a "process options" to add override so that even if the rule is requested asynchronous, if a dependent thread is requiring it, then it will instead run synchronously in the thread of the depending process.
This resulted in both "process options" and "rule options".
They really are the same, so instead remove the current "process options" and rename all "rule options" into "process options".
There are now no longer any "rule options".
I then renamed the "process options" variables to "force options".
Remove signal_all from delete process.
The signals should now all be timed and will exit when thread disabled is set.
The execution used to be designed with the intent that this is where the asynchronous processing would be handled.
After multiple iterations in design, this is no longer the case.
Update the code to always pass parameter option fl_execute_parameter_option_return instead of doing it only for asynchronous processes.
Add missing re-locks.
Some of the functions require that a certain (read) lock be held prior to calling the function.
When the function returns, the expected lock should still be held.
It so happens that this is not consistently the case.
Certain error or exit states are returning without re-establishing the expected (read) locks.
The controller_rule_find() returns a boolean and not a status with potential error bits.
Fix a block of code where it is checking for error bit on the return value of this function.
The controller_rule_read() also returns a boolean and is being handled as a status with a bit.
In this case, change it to return a status given that this makes more sense in this particular case.
Remove relevant/related stale code and add missing comments.
Minor code cleanup:
- Removing no longer valid documentation.
- Cleaning up syntax, such as spacing.
- Updating documentation comments.
- Pass controller_main_t as a pointer in the entry function, updating all uses.
- Remove @fixme for considering status handling because I don't plan on bothering with this now.
A lot of the controller_print_unlock_flush() calls are using the incorrect stream.
Use status_lock as a return in cases where the status that is returned shouldn't become lost when processing the locks.
Otherwise, the errors may never properly bubble to the parent and therefore may never trigger the failsafe.
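One way to read the status_lock change is sketched below, under the assumption that the lock result is stored in its own variable so that it cannot overwrite an earlier failure. The function demo_relock() is hypothetical and uses a plain POSIX mutex with 0 meaning success.

```c
#include <pthread.h>

// Hypothetical helper: `status` is the result of earlier work (0 means success).
// The lock result is kept in a separate variable so that an earlier failure
// still reaches the caller, whatever the lock call returns.
int demo_relock(pthread_mutex_t *lock, int status) {
  const int status_lock = pthread_mutex_lock(lock);

  // An earlier failure is never replaced by the lock result.
  if (status != 0) return status;

  return status_lock;
}
```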
The failsafe is not a rule id, but instead an item id.
If and when failsafe support is enabled, trigger it.
The currently implemented code only handles synchronous failures.
Begin work with adding a "failed" flag so that asynchronous, but required, failures can be detected for the purposes of failsafe execution.
Kevin Day [Mon, 12 Apr 2021 00:14:06 +0000 (19:14 -0500)]
Cleanup: remove "local" workaround in execute function.
At the time I was unable to determine the cause.
Now, I strongly suspect it was due to the way threads were being created on the parent stack and the child threads could not access them once a resize function happened.
The resize can, on occasion (and likely more often than not), relocate the address location of the range (array).
This relocation in memory likely caused the weird memory behavior.
Now that the memory is allocated such that it is an array of allocated pointers, the memory addresses should not change when the array of pointers is resized.
The address to the pointers might change, but from the perspective of the child threads, they don't exist or otherwise matter anyway.
Removing the work around cleans up some clutter and what is essentially a waste of resources.
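The address stability described above can be illustrated with a small sketch. The types demo_process_t and demo_process_set_t and the function demo_resize() are hypothetical; they only show that resizing an array of pointers moves the pointers, not the elements they point to.

```c
#include <stdlib.h>

typedef struct {
  int id;
} demo_process_t;

typedef struct {
  demo_process_t **array; // Array of individually allocated elements.
  size_t size;
} demo_process_set_t;

// Grow the pointer array to `size` entries, allocating each new element separately.
// Returns 0 on success and -1 on allocation failure.
int demo_resize(demo_process_set_t *set, size_t size) {
  if (size <= set->size) return 0;

  // Only the array of pointers may move here, the elements themselves stay put.
  demo_process_t **array = realloc(set->array, size * sizeof(demo_process_t *));
  if (!array) return -1;

  set->array = array;

  for (size_t i = set->size; i < size; ++i) {
    array[i] = calloc(1, sizeof(demo_process_t));
    if (!array[i]) return -1;

    array[i]->id = (int) i;
    set->size = i + 1;
  }

  return 0;
}
```

A thread that was handed set.array[0] before the resize can keep using that pointer afterwards, which is why a copy made as a workaround is no longer needed in this layout.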
Kevin Day [Sun, 11 Apr 2021 23:16:16 +0000 (18:16 -0500)]
Bugfix: test mode (-t), should be simulating not executing.
The rule_options is set up but it seems that I forgot to actually pass it.
Set the controller_process_option_execute on the rule_options and then pass rule_options instead of directly passing controller_process_option_execute.
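A minimal sketch of the flag handling, assuming the options are bit flags and that test mode contributes a simulate flag. The DEMO_OPTION_* values and both functions are hypothetical; the real controller_process_option_* values are not shown in this log.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical flag values.
#define DEMO_OPTION_SIMULATE 0x1u
#define DEMO_OPTION_EXECUTE  0x2u

// Build the options: add the execute flag to whatever is already set,
// rather than passing the execute flag on its own.
uint8_t demo_rule_options(bool test_mode) {
  uint8_t rule_options = test_mode ? DEMO_OPTION_SIMULATE : 0;

  rule_options |= DEMO_OPTION_EXECUTE;

  return rule_options;
}

// A command is really run only when execute is set and simulate is not.
bool demo_really_runs(uint8_t options) {
  return (options & DEMO_OPTION_EXECUTE) && !(options & DEMO_OPTION_SIMULATE);
}
```

Passing DEMO_OPTION_EXECUTE directly drops the simulate flag, so a test run would execute for real, which matches the bug described.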
Kevin Day [Sun, 11 Apr 2021 23:03:29 +0000 (18:03 -0500)]
Bugfix: child process PID is not being passed properly and child process prints failure message on terminate signal.
The child process PID was not being assigned because the wrong variable was being passed.
The child processes should not present error messages on exit when termination signal is received.
Make sure that when the child process exits and the main thread is disabled, exit with F_signal.
Kevin Day [Sun, 11 Apr 2021 23:00:09 +0000 (18:00 -0500)]
Bugfix: main process doesn't always exit.
Remove stale lock and unlock that was used for testing or experimentation.
This was causing it to wait for something that would never release the lock because it was closed resulting in a deadlock and the main process hanging on exit.
Kevin Day [Sun, 11 Apr 2021 21:53:54 +0000 (16:53 -0500)]
Bugfix: rules not properly handling status.
The status F_known_not was incorrectly being treated as busy.
Make sure to guarantee lock and then unlock of the process_other lock (read lock) to prevent execution of dependent rule while analyzing state of dependent rule.
Read the rule's status after executing from the rule in the rules array and not on the process.
The rule in the rules array is updated after execution, but the process is not guaranteed to be.
The status of process is generally reserved for the process to manage the status and not to represent the rule status.
Expanded the example serial execution test entries and rules.
Add a new alternative serial so that one uses normal execution and the other uses dependent execution (execution is triggered by something depending on it rather than being directly started via the entry).
Kevin Day [Sun, 11 Apr 2021 05:01:36 +0000 (00:01 -0500)]
Progress: controller program.
I completely overlooked that the thread time related functions are relative to the absolute system clock.
The timed locks were all acting crazy because I used relative values!
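The relative-versus-absolute mistake can be sketched as a small conversion helper. POSIX timed waits such as pthread_rwlock_timedwrlock() take an absolute deadline, so a relative timeout has to be added to the current time first. The function demo_deadline() is hypothetical and assumes non-negative inputs.

```c
#define _POSIX_C_SOURCE 200809L

#include <time.h>

// Convert a relative timeout into an absolute deadline.
// `now` is passed in (for example from clock_gettime(CLOCK_REALTIME, &now))
// so that the arithmetic stays easy to check.
struct timespec demo_deadline(struct timespec now, long seconds, long nanoseconds) {
  struct timespec deadline;

  deadline.tv_sec = now.tv_sec + seconds;
  deadline.tv_nsec = now.tv_nsec + nanoseconds;

  // Normalize so that tv_nsec stays below one second.
  if (deadline.tv_nsec >= 1000000000L) {
    deadline.tv_sec += deadline.tv_nsec / 1000000000L;
    deadline.tv_nsec %= 1000000000L;
  }

  return deadline;
}
```

Passing a bare relative value such as { 0, 100000000 } as the deadline names a moment in 1970, so the wait times out immediately.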
The process status is not being initialized, resulting in invalid and error prone checks.
Begin fixing problems with the asynchronous processing.
Between incorrect uses of times and some structural problems, the asynchronous behavior was not operating in the order of the dependencies.
The current work fixes that, but there is still more work to do.
I expect this code to not work just yet.
There also seems to be a locking issue somewhere in here that prevents the program from fully exiting.
Check the thread conditions as those have had problems in the past.
Kevin Day [Sun, 11 Apr 2021 04:57:05 +0000 (23:57 -0500)]
Update: Finish implementing the thread condition functions.
It looks like the f_thread_condition_wait() and f_thread_condition_wait_timed() functions were not fully written.
Add missing handlers.
The f_thread_condition_wait_timed() was even missing the return case of F_time!
Update the comments as the "time" value is relative to the absolute system clock time.
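A minimal sketch of handling the timeout return case, written against pthread_cond_timedwait() directly. The DEMO_WAIT_* codes and demo_condition_wait_timed() are hypothetical and only stand in for the project's status codes such as F_time.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <pthread.h>
#include <time.h>

// Hypothetical status codes.
enum {
  DEMO_WAIT_NONE = 0,
  DEMO_WAIT_TIME = 1,
  DEMO_WAIT_FAILURE = 2
};

// Map the result of pthread_cond_timedwait() onto status codes, including the timeout.
// The caller must hold `mutex`, and `deadline` is an absolute time, not a duration.
int demo_condition_wait_timed(pthread_cond_t *condition, pthread_mutex_t *mutex, const struct timespec *deadline) {
  const int error = pthread_cond_timedwait(condition, mutex, deadline);

  if (error == 0) return DEMO_WAIT_NONE;
  if (error == ETIMEDOUT) return DEMO_WAIT_TIME;

  return DEMO_WAIT_FAILURE;
}
```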
Kevin Day [Sat, 10 Apr 2021 15:35:00 +0000 (10:35 -0500)]
Bugfix: improve cancellation point processing and handling.
This avoids the use of f_thread_cancel_test() which is horribly limited in design.
Because f_thread_cancel_test() never returns when thread is in cancelled state, the caller cannot properly handle the situation.
This limits the code design in order to properly use f_thread_cancel_test().
There are already cancellation points in place due to the extensive use of thread.enabled.
Improve these cancellation points, adding timed checks to write locks to further check if a cancellation was received (via thread.enabled).
The timeout is arbitrarily picked as "0.1 seconds".
Hopefully, this doesn't make things too busy, but really the write locks should (ideally) never have to be waiting that long anyway.
Remove stale code in private-common.c involving a write lock and then an immediate unlock on deallocation.
Add additional thread.enabled checks.
There were some places where cancellation points were being returned but were not properly unlocking all held locks that are within the scope of the function.
All threads should now be set to PTHREAD_CANCEL_DEFERRED.
The forced thread termination via kill signals is now removed.
They shouldn't be needed now that cancellation (should be) guaranteed.
This will have to be tested over time to confirm.
Kevin Day [Sat, 10 Apr 2021 15:32:15 +0000 (10:32 -0500)]
Update: handle POSIX compatibility automatically in f_thread_cancel_state_set().
The POSIX standard does not explicitly define that this parameter can be NULL.
To prevent any possible problems, allow NULL but when NULL is received, use a temporary variable in its place.
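The NULL handling described above can be sketched as a thin wrapper around pthread_setcancelstate(). The name demo_cancel_state_set() is hypothetical and is not the signature of f_thread_cancel_state_set().

```c
#include <pthread.h>

// Accept NULL for `previous` without ever passing NULL to pthread_setcancelstate().
// Returns the error number from pthread_setcancelstate(), where 0 means success.
int demo_cancel_state_set(int state, int *previous) {
  int temporary = 0;

  return pthread_setcancelstate(state, previous ? previous : &temporary);
}
```

Callers that do not care about the old state can pass NULL, and callers that do still receive it through `previous`.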
Kevin Day [Sat, 10 Apr 2021 04:31:13 +0000 (23:31 -0500)]
Bugfix: thread exiting issues and related.
My attempt to add locks to satisfy helgrind ended up backfiring on me.
It seems that when an interrupt is received, the cancel is being sent and it happens at any point in time.
Which includes when locks are opened.
When an interrupt cancels a thread with an open lock, that lock is never closed.
Then, using the same locks to handle cleanup of threads resulted in an occasional deadlock.
Remove all locking in the main thread.
The logic should be safe because in the only cases where there might be a conflict, an f_thread_join() sits between them as protection.
The cancel function should avoid locks too, where possible.
I am somewhat more nervous about this case and need to review and confirm whether main->thread->processs.size does not change.
I previously updated the force cancel exit thread to use waits and locks.
It later dawned on me that I could just get rid of the force cancel exit thread entirely and implement the same functionality in the cancel thread.
Kevin Day [Sat, 10 Apr 2021 00:15:07 +0000 (19:15 -0500)]
Update: lock handling tweaks.
Move the rule lock to inside the if condition block because part of the rule is still being read.
Add missing unlock for other process active lock before break.
Relocate the other process active lock in some cases to be after the last access to the rule.
Utilize process r/w locks around starting and joining the main threads.
On exit, check to see if id_exit thread is actually used before attempting to join it.