Kevin Day [Tue, 27 Apr 2021 23:49:35 +0000 (18:49 -0500)]
Bugfix: PID and PID file should account for multiples during process execution.
I forgot all about needing to do this so I am considering this a bug.
Each process may execute multiple Actions.
Each Action has its own PID.
In the case of foreground (synchronous) execution, having only a single PID and PID file path on the Process structure is not a problem.
When with PID file (asynchronous) execution operates, multiple PIDs (and respective PID files) may exist for any single Process structure.
Kevin Day [Tue, 27 Apr 2021 04:09:09 +0000 (23:09 -0500)]
Regression: Entry error after failure or during validation is not propogating.
When an Entry fails and successfully executes a failsafe, the failure is not propagated.
The failsafe is meant to bail out and not continue onward, so after a successful or failed failsafe, return F_failure (with error bit as appropriate).
When passing --validate without --test, the program is not exiting as it should.
This is because the state is not being handled just like the status is not being handled.
When joining threads, be sure to reset the identifiers.
Remove now extra Exit processing block.
This is now handled fully by the cancellation function.
Restore the thread enabled state after operating failsafe.
Get rid of simulate variable, instead use the console parameter directly.
This saves memory by a trivial amount.
Kevin Day [Mon, 26 Apr 2021 23:45:23 +0000 (18:45 -0500)]
Update: Implement "execute" support and fix bugs.
Provide new feature for executing into another program.
This is provided via the new Item Action "execute".
The function controller_perform_ready() is being called and the status is being checked but the status is not being assigned.
There are a few cases where thread.enabled needs to be checked and a few cases it does not need to be checked.
Several of the threads need to be aware of the normal/other status to properly determine the thread.eneabled situation.
Without this, they make incorrect decisions that result in bugs.
I did not want to implement a new structure to resolve this so instead provide custom wrapper functions to call that set the appropriate normal/other state.
The functon controller_thread_process_cancel() now needs to be caller aware so that the caller does not get cancelled.
Kevin Day [Sun, 25 Apr 2021 17:13:54 +0000 (12:13 -0500)]
Update: Improve entry/exit verbose messages, display exit with -tv, and wait all with read lock fix.
The verbose messages should be distinguishing between an entry and an exit now.
When both --simulate and --test are specified, the printout should include both the entry and the exit.
Previously, only the entry was printed.
The wait all should only be triggered to wait for all processes at the current moment in time.
Furthermore, the read lock is being held too long (for the entire loop).
Maintain the read lock long enough to build a list of all processes to wait for.
Kevin Day [Fri, 23 Apr 2021 01:49:21 +0000 (20:49 -0500)]
Progress: redesign processing logic to accommodate different process Rule Actions and update dependency design accordingly.
When I implemented the "exit" support (opposite of an "entry") I noticed a oversight in the design whereas there was no way to distinguish between a process that successfully started via the "entry" or the "exit".
Change the design to now utilize unique Rule Processes for each Rule Action requested.
This required further changes to the status handling.
A rule status is now an array of all possible Rule Actions.
This is utilized using a static array for simplicity purposes (there is no need for a dynamic array here).
All of the recent changes introduced a lot more complex code.
There are now helper functions to help facility common tasks.
This should also make updating easier as there is only one place to update.
The downside is the introduction of an additional function call (which is a tiny runtime cost).
To facility this new design, the Rule files must also be aware of the different Rule Actions.
The "need", "want", and "wish" have been relocated into a new Rule Action called "on".
Additional parameters for an "on" allow for describing the Rule Action in which the dependency applies to.
This allows, for example, a "stop" Action to operate in a different order than a "start" Action.
An example of this is provided.
Look at the data/settings/example/rules/serial/*.rule files.
An example syntax is:
on start need serial s_1
on stop need serial s_3
When validate is passed, do not wait for asynchronous processes because they are not run.
Normally this is not noticeable but is exposed when 'script/fail' failed to execute and return ('script/fail' should execute, fail, and return).
This bug is caused by the wait functions not checking to see if the caller is the same as the current process (it was waiting for itself).
This may have been introduced as a result of the redesign.
Kevin Day [Wed, 21 Apr 2021 22:12:00 +0000 (17:12 -0500)]
Progress: Redesign enty/exit rule handling, now requiring Action instead of "rule".
The Entry and Exit files are using "rule" to designate a Rule to operate.
This is designed on the assumption that an Entry always runs a "start" and Exit always runs a "stop".
This behavior is changed such that "rule" is no longer specified and one of the 9 supported Actions may be used.
Such as "start" or "stop", for example.
There is still more work to do as this change doesn't fix the Exit in terms of dependency handling.
Currently, the process structure does not distinguish the Rule action, such as "start" or "stop".
Additional changes to the process structure are needed.
Kevin Day [Tue, 20 Apr 2021 23:16:45 +0000 (18:16 -0500)]
Progress: controller program, add exit support.
This implements the initial work needed to get exit files working.
A new thread is added to handle both "entry" and "exit" to free up the "rule" thread and allow for the "exit" to start will the "rule" thread exists.
The thread enabled process is now more complex given that there needs to be stages so that "exit" threads can work while "entry" or "control" operations begin terminating.
Add several helper functions to help simplify the detection.
Add an "alert" lock to send/receive alerts.
This is a more general purpose lock that tells something waiting on the alert to wake up and see if anything you care about changed.
The file loading now needs to be aware of "optional" files, namely the exit files.
If the named exit file does not exist, then do not care as it is not required (unlike entry and rule files).
Many of the read and write locks are now timed so that they can check to see if the controller is enabled or not.
While working on this I noticed some oversights and things that I did not previously consider.
I need to follow this up with a redesign in the entry/exit rule handling.
Instead of "rule", I should have the rule actions like "start" and "stop".
There needs to be support for default behaviors, such as allow for "stop" to not have a parameter and to correctly find and terminate a service.
There needs to be consideration on dependencies when exiting, perhaps there needs to be an exit-specific dependency management.
Kevin Day [Sun, 18 Apr 2021 15:47:12 +0000 (10:47 -0500)]
Progress: exit prep work, reserve and implement "setting" in entries and exits, add option to "ready', and update documentation.
Add documentation, specifications, and basic structural changes for the "exit" files.
The "entry" and "exit" files now reserve "setting" for designating settings.
Currently, only "entry" supports a setting and that setting is "mode".
The "mode" setting designates how the entry program is intended to behave.
When operating as a "service", the controller program will wait indefinitely for commands (via the not yet implemented "control" program or "control" socket).
When operating as a "program", the controller program will immediately exit after completion.
The "ready" Entry Action now supports "wait".
When "wait" is provided, the "ready" operation will wait for all current asynchronous processes to complete before operating.
Update documentation and specifications, adding the "exit" files.
Cleanup the existing documentation and specifications, fixing wording.
Kevin Day [Sat, 17 Apr 2021 23:44:52 +0000 (18:44 -0500)]
Update: implement with pid execution, simplify related rules.
Implement the with pid execution.
This expects the process to be spawned in the background.
After some review, I decided to remove "use" and "create", replacing those with "pid_file".
The reasons are:
- For "use", the spawned service manages the pid file, so it would be overly complicated to try and manage it in addition to the spawned service.
- For "create", if the process is to go into the background, in order to manage it then there would still need to be a running process (this defeats the purpose).
When the termination signal is received, then inform any background process spawned by the controller program to exit, based on the existence of the pid file.
Kevin Day [Sat, 17 Apr 2021 19:44:21 +0000 (14:44 -0500)]
Update: fix file stream error return, improve file error messages, add new status types and remove status types.
The f_file_stream_close() function is missing some errors.
The file error message printer is missing several error messages.
There is also some cleanup is needed in the file error messages (consistency problems, mostly).
Add F_file_overflow, and F_file_underflow for file specific overflow and underflow.
Add missing F_file_descriptor_not.
Remove F_file_allocation and F_file_deallocation, only the generalized F_memory and F_memory_not are used now.
Kevin Day [Sat, 17 Apr 2021 16:36:17 +0000 (11:36 -0500)]
Cleanup: relocate fl_color code to f_color and remove fl_color.
Ever since f_string became a core/required/special project where all level_0 could depend on it, fl_color no longer needed to be at level_1.
Relocate fl_color into f_color, removing the fl_color project entirely.
Update all dependencies.
This exposed some missing dependencies in fll_program that fl_color is secretly handling.
Fix that as well.
Kevin Day [Sat, 17 Apr 2021 05:41:11 +0000 (00:41 -0500)]
Feature: controller program must support "with full_path".
This is necessary to accomodate fickle programs like SSHD where the full path in argument[0] is required.
This implements a general feature called "with" which is provided to add flags on a per Rule Type basis.
These flags will tweak the Rule Type is some manner.
Only one flag is suppoted at this time: "full_path".
The "full_path" provides the necessary functionality to make SSHD happy.
Kevin Day [Sat, 17 Apr 2021 05:34:34 +0000 (00:34 -0500)]
Update: redesign fl_execute_parameter_option_path in fll_execute.
The previous design of fl_execute_parameter_option_path seemed pointless because it could be detected if a slash is in the progam name.
Recent changes have utilized the slash in the path to do just that.
While working with the controller program the SSHD program revealed that some programs are fickle about what is in their argument[0].
SSHD wants the full path and as such it needs to be provided.
This makes sense as the normal behavior of most programs when started with a full path general expect a full path in argument[0].
The fl_execute_parameter_option_path is redesigned to instead provide a full path for program and for argument[0].
Kevin Day [Sat, 17 Apr 2021 04:00:00 +0000 (23:00 -0500)]
Bugfix: fll_execute does not execute full path.
When passed a full path to a program (rather than depending on detecting in from just a program name) the program does not execute.
There are several logic flaws and mistakes.
- The last_slash should have "+1" to avoid including the slash itself.
- When environment is cleared, it need to potentially use "program" or "arguments.array[0].string" if a slash already exists in the provided name.
- The final NULL at the end of the program_path string is missing.
- The "program" or "arguments.array[0].string" should be used in general instead of always the "program" only.
- The fixated_is is being used incorrectly in private_fll_execute_path_arguments_fixate (and the documentation is incorrect).
I suspect that these functions are messy due to changes in design that were not fully updated (including the appropriate documentation).
Kevin Day [Fri, 16 Apr 2021 23:48:04 +0000 (18:48 -0500)]
Bugfix: properly configure example.
The reason why controller is attempting to execute "/var/run/sshd.pid" is not because it is incorrectly attempting to process the "use" as a start.
Instead, it is because it is properly attempting to do so because it is being told to.
Oops! This is just a simple misconfiguration and not a bug in the code.
A bug in the configuration.
Kevin Day [Fri, 16 Apr 2021 04:20:22 +0000 (23:20 -0500)]
Progress: controller program, begin working on pid file related executions.
It occurred to me that I could quickly create a process with a PID file using a script.
I then realized that "service" only applies to binaries and not scripts.
Add a new type called "utility" that is identical to "service" in purpose except that it processes scripts.
Begin implementing the PID file related code.
It seems there are a few things to address, such as proper test output display.
For example, I am seeing "Simulating execution of '/var/run/sshd.pid' with the arguments: '' from 'SSH Service'.".
That "/var/run/sshd.pid" should instead be "/usr/sbin/sshd".
Creating or checking existence of the PID files is not yet written.
The behavior of after a process forks is not yet written either.
An error needs to be printed and F_failure (with error bit set) should be returned.
The "success = F_failure" should probably have error bit set.
Add new rules and entries for testing this (to be implemented) functionality.
Update the documentation.
Some of the documentation is outdated and as a result, wrong.
Kevin Day [Thu, 15 Apr 2021 02:52:47 +0000 (21:52 -0500)]
Update: implement read lock handlings support.
There are some cases where I am able to figure out a way to make the logic continue on on failure and handle the case of read lock failure.
There are other cases where I am not yet sure how to handle.
Expect more changes in the future to address ability to continue onward on lock failure.
Some functions now return F_lock (with error bit) to designate that the lock failed.
In these cases, it is for functions that require the caller to have a read lock (such as process->lock) locked before starting.
When such functions return the caller now has a way of knowing that the read lock is no longer held.
Kevin Day [Wed, 14 Apr 2021 22:44:13 +0000 (17:44 -0500)]
Update: handle write lock failures, begin adding support for handling read lock failures.
Get the response on write lock failure.
Present an error message.
Return the error status.
Increase the cancellation timeouts from 0.06 seconds to 0.6 seconds, making it less aggresive.
This results in a 90 second max timeout, which gives more problematic exists a lot more time to cleanly exit.
Begin adding support for getting and handing read lock failures.
Read locks attempts will be in a loop that checks main thread enabled as well.
The timeout is longer than write locks to reduce the CPU overhead as there will be a lot of read locks.
Follow up work will utilize this read lock status handling.
Remove the additional thread enabled check that follows a write lock.
The write lock already checks if the main thread is enabled.
Kevin Day [Wed, 14 Apr 2021 22:41:18 +0000 (17:41 -0500)]
Update: f_thread_lock_read_timed() should return F_resource_not.
The pthread_rwlock_timedrdlock() potentially returns EAGAIN.
This needs to be handled, translated into F_resource_not (with error bit), and then returned.
Kevin Day [Wed, 14 Apr 2021 17:08:27 +0000 (12:08 -0500)]
Update: handle asynchronous failures for failsafe, update locks, and related fixes.
The failsafe needs to be triggered when a required but asynchronous process fails.
I originally planned on implementing this via locks but I would rather avoid adding even more locks.
This approach instead provides a wait loop at the end of the entry waiting only on all required processes.
If any of these fail, then the wait will return the requirement failure.
This is a change to the wait all function and behavior, which is updating to now return statuses.
Get rid of process->status, it is no longer needed now that process->rule.status exists.
Having it remain is wasteful and confusing.
The entry processing is updated to be failsafe aware.
This now potentially operates the failsafe rule item.
The failsafe must not do a wait all like normal operation inside of the entry processing function.
There are some discrepancies between "process options" and "rule options".
Technically, there are no "options" on a rule as this is a concept introduced for/by entries.
The process uses these options but most of them were named "rule options".
I then later created a "process options" to add override so that even if the rule is requested asynchronous, if a dependant thread is requiring it, then it will instead run synchronously in the thread of the depending process.
This resulted in both "process options" and "rule options".
They really are the same, so instead remove the current "process options" and rename all "rule options" into "process options".
There are now no longer any "rule options".
I then renamed the "process options" variables to "force options".
Remove signal_all from delete process.
The signals should now all be timed and will exit when thread disabled is set.
The execution used to be designed with the intent that is where the asynchronous processing would handle.
After multiple iterations in design, this is no longer the case.
Update the code to always pass parameter option fl_execute_parameter_option_return instead of doing it only for asynchronous processes.
Add missing re-locks.
Some of the functions require that a certain (read) lock be held prior to calling the function.
When the function returns, the expected lock should still be held.
It so happens that this is not consistently the case.
Certain error or exit states are returning without re-establishing the expected (read) locks.
The controller_rule_find() returns a boolean and not a status with potential error bits.
Fix a block of code where it is checking for error bit on the return value of this function.
The controller_rule_read() also returns a boolean and is being handled as a status with a bit.
In this case, change it to return a status given that this make more sense in this particular case.
Remove relevant/related stale code and add missing comments.
Minor code cleanup:
- Removing no longer valid documentation.
- Cleaning up syntax, such as spacing.
- Updating documentation comments.
- Pass controller_main_t as a pointer in the entry function, updating all uses.
- Remove @fixme for considering status handling because I don't plan on bothering with this now.
A lot of the controller_print_unlock_flush() calls are using the incorrect stream.
Use status_lock as a return in cases where the status that is retuned shouldn't become lost when processing the locks.
Otherwise, the errors may never properly bubble to the parent and therefore may never trigger the failsafe.
The failsafe is not a rule id, but instead an item id.
If and when failsafe support is enabled, trigger it.
The currently implemented code only handles synchronous failures.
Begin work with adding a "failed" flag so that asynchronous, but required, failures can be detected for the purposes of failsafe execution.
Kevin Day [Mon, 12 Apr 2021 00:14:06 +0000 (19:14 -0500)]
Cleanup: remove "local" workaround in execute function.
At the time I was unable to determine the cause.
Now, I strongly suspect it was due to the way threads were being created on the parent stack and the child threads could not access them once a resize function happened.
The resize can, on occasion (and likely more often than not), relocate the address location of the range (array).
This relocation in memory likely caused the weird memory behavior.
Now that the memory is allocated such that the it is an array of allocated pointers, the memory addresses should not change when the array of pointers is resized.
The address to the pointers might change, but from the perspective of the child threads, they don't exist or otherwise matter anyway.
Removing the work around cleans up some clutter and what is essentially a waste of resources.
Kevin Day [Sun, 11 Apr 2021 23:16:16 +0000 (18:16 -0500)]
Bugfix: test mode (-t), should be simulating not executing.
The rule_options is setup but it seems that I forgot to actually pass it.
Set the controller_process_option_execute on the rule_options and then pass rule_options instead of directly passing controller_process_option_execute.
Kevin Day [Sun, 11 Apr 2021 23:03:29 +0000 (18:03 -0500)]
Bugfix: child process PID is not being passed properly and child process prints failure message on terminate signal.
The child process PID was not being assigned because the wrong variable was being passed.
The child processes should not present error messages on exit when termination signal is received.
Make sure that when the child process exists and the main thread is disabled, exit with F_signal.
Kevin Day [Sun, 11 Apr 2021 23:00:09 +0000 (18:00 -0500)]
Bugfix: main process doesn't always exit.
Remove stale lock and unlock that was used for testing or experimentally.
This was causing it to wait for something that would never release the lock because it was closed resulting in a deadlock and the main process hanging on exit.
Kevin Day [Sun, 11 Apr 2021 21:53:54 +0000 (16:53 -0500)]
Bugfix: rules not properly handling status.
The status F_known_not was incorrectly being treated as busy.
Make sure to guarantee lock and then unlock of the process_other lock (read lock) to prevent execution of dependent rule while analyzing state of dependent rule.
Read the rule's status after executing from the rule in the rules array and not on the process.
The rules on the rules array is updated after execution but the process is not guaranteed this.
The status of process is generally reserved for the process to manage the status and not to represent the rule status.
Expanded the example serial execution test entries and rules.
Add a new alternative serial so that one uses normal execution and other uses dependent execution (execution is executed by something depending on it rather than being directly started via the entry).
Kevin Day [Sun, 11 Apr 2021 05:01:36 +0000 (00:01 -0500)]
Progress: controller program.
I completely overlooked that the thread time related functions are relative to the absolute system clock.
The timed locks were all acting crazy because I used relative values!
The process status is not being initialized, resulting in invalid and error prone checks.
Begin fixing problems with the asynchronous processing.
Between incorrect uses of times and some stuctural problems, the asynchronous behavior was not operating in the order of the dependencies.
The current work fixes that, but there is still more work to do.
I expect this code to not work just yet.
There also seems to be a locking issue somewhere in here that prevents the program from fully exiting.
Check the thread conditions as those have had problems in the past.
.
Kevin Day [Sun, 11 Apr 2021 04:57:05 +0000 (23:57 -0500)]
Update: Finish implementing the thread condition functions.
It looks like the f_thread_condition_wait() f_thread_condition_wait_timed() functions were not fully written.
Add missing handlers.
The f_thread_condition_wait_timed() was even missing the return case of F_time!
Update the comments as the "time" value is relative to the absolute system clock time.
Kevin Day [Sat, 10 Apr 2021 15:35:00 +0000 (10:35 -0500)]
Bugfix: improve cancellation point processing and handling.
This avoids the use of f_thread_cancel_test() which is horribly limited in design.
Because f_thread_cancel_test() never returns when thread is in cancelled state, the caller cannot properly handle the situation.
This limits the code design in order to properly use f_thread_cancel_test().
There are already cancellation points in place due to the extensive use of thread.enabled.
Improve these cancellation points, adding timed checks to write locks to further check if a cancellation was received (via thread.enabled).
The timeout is arbitrarily pick as "0.1 seconds".
Hopefully, this doesn't make things too busy, but really the write locks should (ideally) never have to be waiting that long anyway.
Remove stale code in private-common.c involving a write lock and then an immediate unlock on deallocation.
Add additional thread.enabled checks.
There were some places where cancellation points were being returned but were not properly unlocking all held locks that are within the scope of the function.
All threads should now be set to PTHREAD_CANCEL_DEFERRED.
The forced thread termination via kill signals are now removed.
They shouldn't be needed now that cancellation (should be) guaranteed.
This will have to be tested over time to confirm the truth of.
Kevin Day [Sat, 10 Apr 2021 15:32:15 +0000 (10:32 -0500)]
Update: handle POSIX compatibility automatically in f_thread_cancel_state_set().
The POSIX standard does not explicitly define that this parameter can be NULL.
To prevent any possible problems, allow NULL but when NULL is received, use a temporary variable in its place.
Kevin Day [Sat, 10 Apr 2021 04:31:13 +0000 (23:31 -0500)]
Bugfix: thread exiting issues and related.
My attempt to add locks to make helgrind ended up backfired on me.
It seems that when an interrupt is received, the cancel is being sent and it happens at any point in time.
Which includes when locks are opened.
When an interrupt cancels a thread with an open lock, that lock is never closed.
Then, using the same locks to handle cleanup of threads resulted in an occasional deadlock.
Remove all locking in the main thread.
The logic should be safe as the only cases where there might be a conflict, a f_thread_join() is in protected between them.
The cancel function should avoid locks to, where possible.
I am somewhat more nervous about this case, I need to review and confirm if main->thread->processs.size does not change.
I previously updated the force cancel exit thread to use waits and locks.
It later dawned on me that I could just get rid of the force cancel exit thread entirely and implement the same functionality in the cancel thread.
Kevin Day [Sat, 10 Apr 2021 00:15:07 +0000 (19:15 -0500)]
Update: lock handling tweaks.
Move the rule lock to inside the if condition block because part of the rule is still being read.
Add missing unlock for other process active lock before break.
Relocate the other process active lock in some cases to be after the last access to the rule.
Utilize process r/w locks around starting and joining the main threads.
On exit, check to see if id_exit thread is actually used before attempting to join it.
Kevin Day [Fri, 9 Apr 2021 02:10:37 +0000 (21:10 -0500)]
Progress: controller program.
Copy over the rule item action statuses when the rule process completes.
Update cleanup thread:
- deallocate rule data in no longer used space.
- make sure cache thread is being started.
Implement asynchronous example rules for testing.
Disable asynchronous execution when running in validation mode.
This ensures consistency and cleanliness when outputting the validation data.
There is no reason to run asynchronously in this mode because nothing is supposed to be run anyway.
Make sure that execution (simulated or not) is not performed in validation mode.
Make sure the program immediately executes at the end when in validation mode.
Kevin Day [Fri, 9 Apr 2021 00:51:14 +0000 (19:51 -0500)]
Workaround: weird print data when received by a pipe.
Not sure what is going on here, but it seems that the data is not correctly printing via pipes.
Add a flush before unlocking the print locks as workaround.
Kevin Day [Thu, 8 Apr 2021 03:25:05 +0000 (22:25 -0500)]
Progress: controller program.
Update some of the comments.
The "consider" was not being loaded into the "process" array.
Change the behavior to allow for loading the "consider" rules into the "process" array without executing them.
Fix a bug where some entry items were being skipped.
The counter was being added an extra time during the loop, resulting in the skip.
The child processes are not exiting cleanly.
The state F_child was not being fully bubbled up the call stack.
It seems that when forking within a thread, the forked child process, on exit will not result in the thread returning to the caller.
For now there is 1 block (according to valgrind) that is "possibly lost" for every forked child process.
A more in depth investigation is needed.
Add a stub for an asynchronous entry that I will be writing for additional testing of the asynchronous processing.
Kevin Day [Sun, 4 Apr 2021 23:16:13 +0000 (18:16 -0500)]
Update: controller program should continue on error.
Except for certain error codes, like out of memory, record errors and continue onward.
Future versions should somehow handle out of memory issues if this program is to become an "init" replacement.
Avoid unnecessary assignment of rules error status.
In some cases, the rules error status was being assigned twice.
Kevin Day [Sun, 4 Apr 2021 21:52:57 +0000 (16:52 -0500)]
Bugfix: wait condition issues, memory addressing issues, and exiting issues.
The thread conditions, when waiting, may never stop waiting.
Switch to a timed wait, allowing for the condition to determine if it should stop waiting.
I attempting to signal the waits on exit, but this didn't work (however, this will still signal on exit).
Make sure to join all threads, add a failsafe thread join in the process delete function.
The array of processes is a single array.
When this array gets reallocated (resizing to a larger size) the memory addresses may change.
This is a serious problem in that the threads are using those addresses and they suddenly change.
The threads have no way of switching to the new addresses and memory problems happen.
Redesign the array of processes to an array of pointers to processes.
I considered using a block structure, but decided to keep it simple for now.
The downside of this is that on every resize, there must be another allocation for the process address being pointed to.
In the future, I may consider switching to a block structure where I allocated multiple blocks at a time, while still using pointers.
Rename "id_process" to "id_child" to avoid potential confusion between "processes" and "pids", given that I am using "process" with a very different context in this project.
Update all timeouts to be stored in macros.
The cleanup and exit functions were deadlocking, change the locking usage in these functions.
Add missing increment at the end of the loop when processing entry items.
Kevin Day [Fri, 2 Apr 2021 04:54:36 +0000 (23:54 -0500)]
Progrress: controller program.
This gets the program back into a semi-working state.
I've tested the threading and the previous threading problems appear to be gone.
That tells me this massive rewrite/redesign seems to have been worth it.
There is still a lot more to do:
1) Something is not being shutdown properly, on exit (using the control-c termination signal with -t passed to it).
2) The simulate parameters are no longer working (oops!).
3) The printing appears done, but I've confirmed some problems due to design*.
4) I have not done extensive tests, so there may be other regressions.
* As for the case of #3, the problem is with printf() (and fprintf()) and threading:
- The child processes will write to stdout and stderr.
- The controller program also writes to stdout and stderr.
- printf() and fprintf() guarantee locking, but this doesn't help between different executions.
- Writing one massive complete fprintf with, say 20 parameters, is ridiculous.
This leads me to believe the best solution is to write my own printf/fprintf replacement...which is a huge task.
If I do this, then I can get rid of all of that print locking and use the locking provided by the custom functions.
The difference being that these locks would be designated by another, special function, so that the locks can be held across function calls.
The other advantage would be being able to add support for many of the specialized FLL structures (particularly color printing).
The downside is I would need to do a lot of research in witing to the terminal (the putc() and similar stdio functions) and bufferring.
I would then have to handle all of this!
Which is a huge amount of work.
So, for the time being #3 will be ignored and left as a design flaw (or more accurately a design limitation).
Kevin Day [Mon, 29 Mar 2021 01:32:33 +0000 (20:32 -0500)]
Bugfix: fix thread initializers and add documentation on thread allocation requirements.
The PTHREAD_ONCE_INIT cannot be deleted.
The pthread_spinlock_t cannot use PTHREAD_MUTEX_INITIALIZER, so instead use ((pthread_spinlock_t) 0xFFFFFFFF).
Many of the thread structures require dynamic initialization.
Document this behavior.
Kevin Day [Sat, 20 Mar 2021 00:36:28 +0000 (19:36 -0500)]
Progress: controller program and related.
Use "copy" instead of "clone", it seems more accurate given that the code is not guaranteeing the same exact memory structure (only the data is guaranteed).
I believe that I need to document my "completeness principle", documenting the structures and what needs to be done.
I also need to document the exception cases.
Implementing the rule copy function, I realized that I need to have the copy function and not just utilize the "append" functions.
There are many functions where "append" does not make sense.
This means that "copy" must be part of the completeness.
Comment out the cache clearing code.
I will probably get to that part last.
Kevin Day [Thu, 18 Mar 2021 23:43:12 +0000 (18:43 -0500)]
Bugfix: recently added capability functions are from newer version.
The libcap project hasn't been updated in a long time.
Apparently, somebody relatively recently picked up the project and started maintaining it.
This introduced newer functions.
For some reason, my system has a hybrid of this.
The headers show the newer functions but the libraries lack them.
Add a new define "_libcap_legacy_only_".
Enable this by default given how long libcap has exist unchanged.
I also noticed and inconsistency with the function names for users and groups (which are newer functions).
Rename them to not include the "_id".
Remove a duplicate function that didn't even have a reference in the header (oops!).
Kevin Day [Thu, 18 Mar 2021 22:43:51 +0000 (17:43 -0500)]
Update: capability.
Add missing typedefs and cleanup ordering of typedefs.
Add several more functions.
I found more functions than what are implemented now.
The problem is my current systems manpages don't easily find them (such as cap_new_launcher, cap_iab_init, cap_iab_get_vector, etc..).
I will not implement this for now and will need to do some research to see what they are.
Some functions are documented in the manpages as deprecated or obsolete.
I would rather not want to implement any of the deprecated or obsolete functions.
Make sure all "implemented not" functions still do parameter checking.
Use "clone" instead of "duplicate" to be more consistent with the terminology used in this project.
Kevin Day [Thu, 18 Mar 2021 22:13:38 +0000 (17:13 -0500)]
Feature: support f_string_constant_t as a way to pass "const char *" as a parameter.
The parameter will have the return value and needs to be a pointer.
Passing "const char **", the "const" is applied to "char **", which is not what is wanted.
The desired logic is "(const char *) *", but that is not valid syntax.
To achieve this, "const char *" is turned into the typedef "f_string_constant_t".
This allows for "(const char *) *" to be used in the form of "f_string_constant_t *".
This is a special, exceptional, case and there are no plans to support an "array of" or any of the completeness practices.
Kevin Day [Wed, 17 Mar 2021 04:40:52 +0000 (23:40 -0500)]
Update: f_capability improvements and cleanups, and provide f_implemented_not status codes.
Provie F_implemented and F_implemented_not, then use them for capability function support.
This now replaces F_suppoted_not for this particular usage, allowing f_supported_not to be used more naturally for other purposes such as representing codes used by the POSIX libc standard.
Cleanup the documentation comments, updating to the latest style.
Add some missing capability functions (I think there are a lot more to do, so another commit will eventually follow this one).
A number of the status codes were incorrect and/or undocumented.
Make sure F_parameter is handled more consistent (adding it where it is missing).
My increased experience with the POSIX libc standard has led me to believe that it is not safe to use (XX < 0) comparisons when comparing for -1.
This is because POSIX libc is often more loosely defined so if it says -1, that allows for all other negative values to be implementation-specific.
By checking for (XX < 0), this would expose this project to potential implementation specific bugs.
This likely needs to be fixed everywhere, but for now f_capability is where I am starting.
A new function f_capability_supported_ambient() is provided to handle the CAP_AMBIENT_SUPPORTED() macro function.
"eoa" or "End of Array" is being experimentally added as a compliment of "eos" or "End of String".
Remove stale status code from before I fully embraced error and warning bits.
Add more append functions.
Make existing functions more consistent, checking for array overflows, returning F_array_too_large when appropriate.
Update the comments while I am at it.
Relocate the f_string_dynamic_* functions in string.h and string.c to string-dynamic.h and string-dynamic.c, respectively.
This requires that the string_dynamic.h header depend on string_range.h
Kevin Day [Fri, 12 Mar 2021 04:54:45 +0000 (22:54 -0600)]
Update: implement append functions for array types and implement all the other functions.
Start implementing all of the macros as real functions for these.
Macros do what I wanted and save writing code, but the binary size is more of a concern.
Switching to explicit functions should also make debugging easier.
The macros may be removed eventually and all of the code utilizing the macros will be updated to use these.
There will be more changes like this going forward in some other projects like f_string and f_utf_string.
Kevin Day [Sat, 6 Mar 2021 21:55:17 +0000 (15:55 -0600)]
Update: replace f_string_length_t (and related) and miscellaneous cleanus & fixes.
The f_string_length_t, f_string_lengths_t, etc.. types are somewhat redundant.
Simplify the code (reducing binary size as well) and just stick with the f_array_length_t types.
Kevin Day [Sun, 14 Feb 2021 23:19:08 +0000 (17:19 -0600)]
Progress: controller program, address issues with invalid reads/writes.
I have finally identified the cause of the confusing invalid reads/writes.
It seems that using the pointers to memory address directly associated with the thread data and then accessed within a fork() call is a problem.
By copying the data to variables local to a function and then using that memory address, the read/write problems go away.
This commit only performs an immediate fix for the current problem.
What I really need to do next is to rewrite/restructure much of the code with this behavior in consideration.
There were a lot of experimental changes I performed while trying to identify this problem.
There are still some additional locking and other thread problems to solve.
These will hopefully be fixed by the planned rewrite cleanup that I need to do.
Kevin Day [Fri, 12 Feb 2021 03:53:11 +0000 (21:53 -0600)]
Progress: controller program.
Continue working on the thread support.
As I am learning how to use threading, I am finding that my conversion is flawed in some way and somehow losing the pointer address.
I worry that this might be related to how realloc() changes the pointer address, but there is no strong evidence to this fear.
I've decided to just save the current state (regardless of its state) and will rethink the design.
Currently the behavior is more consistent but still problematic.