Kevin Day [Sat, 9 Nov 2019 01:25:07 +0000 (19:25 -0600)]
Progress: rewriting fss_* programs and all dependencies
I am changing the parameters and design of the fss_* programs, such as fss_basic_read.
There is no work done on the fss_*_write programs yet.
There were a lot of changes in the dependencies, including cleanups and improvements.
The parameters passed to the fss_* functions will now be more consistent across each of them.
This should make scripting much easier.
There is a lot of incomplete work and I am focused currently on getting fss_basic_read to work as desired.
I halted my work on f_conversion and fl_console to make sure none of these changes are lost.
I do not expect this commit to compile with everything due to the incomplete work.
I would rather post incomplete code than risk losing code as has happened in the past.
Kevin Day [Fri, 1 Nov 2019 04:07:31 +0000 (23:07 -0500)]
Progress: continue working on fss-003 Extended List
A nested type has been created.
I suspect that I will need to change the structure of the other types to improve consistency, but more review and consideration is needed before any such changes are made.
A read program was written but it is essentially a copy and paste of Basic List, with a few minor changes just to make it compile.
The program arguments of all the FLL programs will need to be changed so that support for "depth" selection can be added for things like Extended List.
The memory allocation is implemented but not reviewed.
I converted the behavior to support nesting, but I need to review the logic to ensure I caught everything.
Kevin Day [Wed, 18 Sep 2019 00:09:44 +0000 (19:09 -0500)]
Update: finish implementing f_utf_character_is_valid() and related UTF-8 changes
The UTF-8 BOM is not a requirement but only a suggestion, see RFC 3629.
I consider it a very bad practice now that I have learned that it is also the zero width no-break space.
Get rid of the UTF-8 BOM support; it is a bad idea and is not to be supported by this project.
The referenced RFC also provides an easier way to view the valid ranges than my previous resources (such as Wikipedia).
This helped me finish this function.
Updated byte_dump to better utilize this and to remove no longer necessary code.
Fix an accidental incorrect "invalid detection" check use before calling f_utf_character_is_valid() in byte_dump.
Explicitly print a "." or " " for UTF-8 control characters (ASCII control characters are already handled before this point so it is safe to call f_utf_character_is_control()).
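A minimal sketch of the valid ranges as RFC 3629 lists them, for reference. The function name example_utf8_valid_length() is hypothetical and this is not the code behind f_utf_character_is_valid(); it only shows how the table of first-byte and second-byte ranges can be checked.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the length (1 to 4) of the valid UTF-8 sequence
 * starting at s, or 0 when the sequence is invalid or cut short by n.
 * The ranges follow the UTF8-1 .. UTF8-4 grammar in RFC 3629.
 */
size_t example_utf8_valid_length(const uint8_t *s, size_t n) {
  size_t length;
  uint8_t low = 0x80;  /* lowest allowed value for the second byte */
  uint8_t high = 0xBF; /* highest allowed value for the second byte */

  if (n == 0) return 0;

  if (s[0] <= 0x7F) return 1;
  else if (s[0] >= 0xC2 && s[0] <= 0xDF) length = 2;
  else if (s[0] == 0xE0) { length = 3; low = 0xA0; }  /* rejects overlong forms */
  else if (s[0] == 0xED) { length = 3; high = 0x9F; } /* rejects surrogates */
  else if (s[0] >= 0xE1 && s[0] <= 0xEF) length = 3;
  else if (s[0] == 0xF0) { length = 4; low = 0x90; }  /* rejects overlong forms */
  else if (s[0] == 0xF4) { length = 4; high = 0x8F; } /* rejects values above U+10FFFF */
  else if (s[0] >= 0xF1 && s[0] <= 0xF3) length = 4;
  else return 0; /* 0x80-0xC1 and 0xF5-0xFF never start a sequence */

  if (n < length) return 0;
  if (s[1] < low || s[1] > high) return 0;

  /* Any remaining bytes must be plain continuation bytes. */
  for (size_t i = 2; i < length; i++) {
    if (s[i] < 0x80 || s[i] > 0xBF) return 0;
  }

  return length;
}
```

Note that the byte sequence EF BB BF (U+FEFF) is a valid encoding under these rules; dropping BOM support is a policy decision, not a validity check.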
Kevin Day [Tue, 17 Sep 2019 00:49:00 +0000 (19:49 -0500)]
Progress: finish the main parts of invalid UTF-8 detection
This wraps up the work needed for all explicitly declared invalid sequences.
There are some sequences, such as "Overlong", that Wikipedia (at this time) considers invalid but does not explicitly define.
I need to figure out what these really are and handle them.
There are also likely cases of accidental copy and paste that will be fixed as I discover them (sorry, the size of documentation I had to go through to get these invalid sequences is massive to me).
There are also some @todo situations that I would like to resolve.
Kevin Day [Sun, 15 Sep 2019 03:35:26 +0000 (22:35 -0500)]
Update: disable init until I can get around to it
I started to at least clean up some of the compile errors, but this was simply too much of a mess.
Instead, just comment out code and deal with it later.
Kevin Day [Sat, 14 Sep 2019 20:59:45 +0000 (15:59 -0500)]
Progress: begin converting byte_dump to using f_utf_character_is_valid()
The function, f_utf_character_is_valid(), can be a bit expensive, so only call it if the current character is not already known to be invalid.
The function, byte_dump_print_text(), will need to be updated as well, given that the invalid range now includes some sequences currently being swapped with a space.
Kevin Day [Sat, 14 Sep 2019 00:38:52 +0000 (19:38 -0500)]
Update: begin improving UTF-8
I am now moving to perform a more thorough implementation of UTF-8 support.
Cleaned up the functions.
Due to the sheer size of the changes needed, I am uploading this in stages to ensure nothing gets lost.
The work done is incomplete.
The functions will need to be reviewed once everything is in place.
Kevin Day [Thu, 12 Sep 2019 22:17:19 +0000 (17:17 -0500)]
Update: use int8_t instead of char
Guarantee that we are always dealing with 1-byte values by using int8_t instead of char.
They should be identical, but this prevents a given system from doing something different.
Whether plain char is signed is implementation-defined, so using int8_t (or uint8_t) makes the signedness explicit.
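A small sketch of why the explicit type matters. The function names are hypothetical; the point is only that the same byte reads differently depending on signedness, and that CHAR_MIN reveals what plain char does on a given system.

```c
#include <limits.h>
#include <stdint.h>

/* Returns 1 when plain char is signed on this system, 0 otherwise. */
int example_plain_char_is_signed(void) {
  return CHAR_MIN < 0;
}

/* Reads a byte as a signed 8-bit value, regardless of what plain char does. */
int example_byte_as_signed(uint8_t byte) {
  return (int8_t) byte;
}

/* Reads a byte as an unsigned 8-bit value, regardless of what plain char does. */
int example_byte_as_unsigned(uint8_t byte) {
  return byte;
}
```

The byte 0xE2 (the start of many 3-byte UTF-8 sequences) is 226 when read as uint8_t and negative when read as int8_t, so comparisons such as "byte >= 0x80" only behave as expected with the unsigned type.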
Kevin Day [Thu, 12 Sep 2019 03:38:30 +0000 (22:38 -0500)]
Update: documentation for f_pipe and add additional pipe functions
Provide f_pipe_warning_exists(), f_pipe_error_exists(), and f_pipe_debug_exists().
In theory, the program should be able to grab data piped from any of these sources, if both the source exists and a way to pipe the source exists.
Kevin Day [Thu, 12 Sep 2019 02:03:08 +0000 (21:03 -0500)]
Update: start enum's at 1 where possible
By always starting enums at 1, the 0 value can be reserved as not-set.
There are still a few situations where enums must not start at 1.
Some are:
1) Type definitions, such as in f_types where the status codes need to start at 0 for f_false.
2) Any enums that map one-to-one to an array, such as with parameter options.
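The two cases above can be sketched as follows. All names here are hypothetical and only illustrate the convention; they are not taken from the project's headers.

```c
/* Hypothetical mode enum: 0 is deliberately unused so that it can mean "not set". */
enum example_mode {
  example_mode_hexadecimal = 1,
  example_mode_octal,
  example_mode_binary
};

/* Hypothetical parameter ids that index an array one-to-one, so they must start at 0. */
enum example_parameter_id {
  example_parameter_id_help = 0,
  example_parameter_id_width,
  example_parameter_id_total
};

struct example_settings {
  enum example_mode mode; /* stays 0 until a mode is chosen */
};

/* Returns 1 when a mode has been chosen, 0 when it is still not-set. */
int example_mode_is_set(const struct example_settings *settings) {
  return settings->mode != 0;
}
```

A zero-initialized structure then reports every such field as not-set without any extra bookkeeping.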
Kevin Day [Tue, 10 Sep 2019 00:44:12 +0000 (19:44 -0500)]
Update: Add 3 presentation modes to byte_dump: normal, simple, and classic
Normal presentation will replace each ASCII control or whitespace character with the UTF-8 picture character that represents it.
Simple presentation will use a single space to represent any given ASCII control or whitespace character.
Classic presentation will do what the "hexdump" tool traditionally does and use a single period to represent an ASCII control or whitespace character.
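A rough sketch of the three modes. It assumes the "picture characters" are the Unicode Control Pictures block (U+2400 to U+2421), which is a guess rather than something this message states, and every name in it is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum example_presentation {
  example_presentation_normal = 1,
  example_presentation_simple,
  example_presentation_classic
};

/*
 * Writes the replacement for an ASCII control or whitespace byte into out
 * (NUL terminated) and returns the number of bytes written.
 * Returns 0 when the byte is not an ASCII control or space character.
 */
size_t example_present(uint8_t byte, enum example_presentation mode, unsigned char out[4]) {
  if (!(byte <= 0x20 || byte == 0x7F)) return 0;

  if (mode == example_presentation_normal) {
    /* Assumed mapping: U+2400 + byte for 0x00-0x20, and U+2421 for DEL. */
    const uint8_t offset = (byte == 0x7F) ? 0x21 : byte;

    /* U+2400-U+243F encode in UTF-8 as E2 90 80 through E2 90 BF. */
    out[0] = 0xE2;
    out[1] = 0x90;
    out[2] = (unsigned char) (0x80 + offset);
    out[3] = 0;

    return 3;
  }

  /* Simple uses a space, classic uses a period like "hexdump". */
  out[0] = (mode == example_presentation_classic) ? '.' : ' ';
  out[1] = 0;

  return 1;
}
```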
Kevin Day [Mon, 9 Sep 2019 04:19:10 +0000 (23:19 -0500)]
Update: remove common type wrappers and use typedef instead of '#define'
I intend to begin transitioning away from the core types like 'int', 'char', etc.
As part of this, I need to remove a number of the type #define wrappers.
This is also done, in part, because I learned that there are some equivalents to f_min_s_int.
Using explicit types is safer and better designed than something like 'char'.
The goal will be to replace 'char' with uint8_t (or int8_t as needed).
Furthermore, specifying int32_t and int64_t (and similar) should improve the code quality.
The use of types like "wchar" is dangerous because some systems use different sizes.
Instead, for something like "wchar", a uint32_t might be used.
(Although this project is to be designed around UTF-8, so the use of wchar is wrong anyway, it does make a good example.)
Kevin Day [Mon, 9 Sep 2019 03:57:47 +0000 (22:57 -0500)]
Update: add a space after "combining" characters and catch a few more invalid UTF-8 sequences
Previously, I just printed a space instead of printing the "combining" characters.
It occurred to me that I could print a space following a known "combining" character to cause it to combine into a space.
This makes things easier to view and still displays the combining character instead of hiding it behind a blank space.
The downside is that this might cause problems if someone tried to copy and paste these combined characters.
Catch a few more invalid UTF-8 sequences that I came across while making these changes.
Fix an existing invalid UTF-8 sequence detection that seems to have been incomplete and incorrect.
Kevin Day [Sun, 8 Sep 2019 21:46:27 +0000 (16:46 -0500)]
Cleanup: replace argc and argv usage with a single structure of argc and argv (f_console_arguments)
Simplify the parameters being passed to functions by providing a helper structure called f_console_arguments to handle the argc and argv standard arguments.
Due to being standard arguments, I am leaving the names as 'argc' and 'argv' despite it being a violation of the naming policy of this project.
('argc' should be something like 'used', and 'argv' should be 'arguments'.)
The firewall had a naming conflict, so rename the usage of "arguments" in firewall into "parameters".
Kevin Day [Sun, 8 Sep 2019 20:39:29 +0000 (15:39 -0500)]
Regression: display "+" and "++" and not "-" and "--" for special parameter options
When I wrote fll_program_print_help_option() I completely forgot to provide a way to set either "-" or "+" and "--" or "++".
This resulted in the "--help" display of the options to incorrectly print using "-" and "--".
Add additional function parameters to allow setting the symbols when calling fll_program_print_help_option().
Kevin Day [Sun, 8 Sep 2019 20:22:41 +0000 (15:22 -0500)]
Update: Simplify console priority checking by providing fl_console_parameter_prioritize() and fix name of fl_console functions
Simplify the code for determining which console parameter of some set of console parameters has priority.
Abstract this functionality into its own function so that other projects can leverage this.
The functions in fl_console should be prefixed with fl_console.
Kevin Day [Sun, 8 Sep 2019 04:25:50 +0000 (23:25 -0500)]
Feature: bit_dump level 3 program
Provide a program to help analyze files, supporting UTF-8.
This should work similarly to "hexdump" but is not intended to match it feature for feature.
Provides three byte printing modes (with plans for a fourth):
1) hexadecimal (default)
2) octal
3) binary
4) digit (planned)
Provides first and last byte selection support.
A width option is available for specifying the number of bytes to be printed on screen such that each byte is essentially a data column.
With a width of 16, then there would be 16 data columns, each displaying one byte.
Although similar to "hexdump", the first column in bit_dump represents the specific row number.
A text option is provided to display the bytes as a character (similar to how "hexdump" uses "-C").
A placeholder option is available for showing a placeholder where placeholder spaces would otherwise be printed.
A placeholder is printed to ensure alignment.
For example, a printable UTF-8 character that is 3-bytes wide would only visibly take up 1 character of space.
To keep the alignment with text to bytes accurate and consistent, two additional placeholder spaces are appended following the UTF-8 character.
If the bytes terminate before an entire column set of bytes are printed, then spaces or placeholders are printed until the full column may be printed when in "text" mode.
This will detect and report invalid UTF-8 codes.
Handling printing the characters (via the text option) can be tricky.
There is more work needed to catch all cases.
Some cases cannot be handled if the character is wider than the expected width (causing alignment printing issues).
I am still a bit inexperienced with the intricacies of UTF-8 and I expect there to be issues in this first pass.
Try to avoid returning f_invalid_parameter to represent invalid parameters for standard C/POSIX functions.
Implement new exceptions, like f_invalid_name and f_invalid_desciptor, to accommodate this.
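The placeholder alignment described in this entry can be sketched as below. The function names are hypothetical, and the sketch assumes every printable character occupies exactly one text column, which is the very assumption that breaks for characters wider than expected.

```c
#include <stdint.h>

/*
 * Returns how many bytes a UTF-8 sequence occupies, judged by its first byte,
 * or 0 when the byte cannot start a sequence.
 */
unsigned example_utf8_byte_width(uint8_t first) {
  if (first < 0x80) return 1;
  if ((first & 0xE0) == 0xC0) return 2;
  if ((first & 0xF0) == 0xE0) return 3;
  if ((first & 0xF8) == 0xF0) return 4;

  return 0;
}

/*
 * One character fills one text column, so each additional byte of the
 * sequence needs one placeholder to keep the text aligned with the bytes.
 */
unsigned example_placeholders_needed(uint8_t first) {
  const unsigned width = example_utf8_byte_width(first);

  return (width > 1) ? width - 1 : 0;
}
```

For a 3-byte character such as one starting with 0xE2, this yields the two placeholders mentioned above.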
Kevin Day [Tue, 3 Sep 2019 04:21:08 +0000 (23:21 -0500)]
Update: console code and some macros
Add explicit dark mode console option.
Add support for order of operations priority on parameters ("+n +l +d" would result in colors for dark background because +d overrides both +n and +l due to being the right-most parameter).
Rewrite level_3 programs to utilize the new fll_program_process_parameters() helper function.
Fix some screwed up macro definitions and uses.
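The right-most-wins rule for "+n +l +d" can be sketched as below. The structure and function are hypothetical stand-ins and not the actual fl_console or fll_program code; the sketch assumes each parameter records the position where it was last seen, with 0 meaning it was not given.

```c
#include <stddef.h>

/* Hypothetical record of where a parameter was last seen on the command line (0 = not given). */
struct example_console_parameter {
  char symbol;
  size_t location;
};

/*
 * Returns the index of the parameter found furthest to the right,
 * or count when none of the parameters were given.
 */
size_t example_prioritize(const struct example_console_parameter *parameters, size_t count) {
  size_t winner = count;
  size_t furthest = 0;

  for (size_t i = 0; i < count; i++) {
    if (parameters[i].location > furthest) {
      furthest = parameters[i].location;
      winner = i;
    }
  }

  return winner;
}
```

With locations 1, 2, and 3 for "+n", "+l", and "+d", the function selects "+d", matching the dark background result described above.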
Kevin Day [Mon, 2 Sep 2019 05:50:51 +0000 (00:50 -0500)]
Update: rename fl_program as fll_program, rename f_colors and fl_colors, move functions from fll_colors into fl_colors, and provide additional fll_program functions
Clean up the parameter processing code.
In the process:
- The fl_program was moved into fll_program.
- fll_colors_load_context() is at too high of a level, move it to fl_colors_load_context() (and delete the now empty fll_colors project).
- Update programs to use the updated fll_program helpers.
- The f_colors and fl_colors should now be f_color and fl_color.
Kevin Day [Mon, 2 Sep 2019 00:21:40 +0000 (19:21 -0500)]
Feature: add 'dependency' generation to package.sh and update all dependencies
This will help ensure that dependencies will be accurate and less error prone (so long as the script is run after making changes to any projects).
Dependencies are processed from the project's individual data/build/dependencies file.
The order of the dependencies does matter and is processed from top to bottom.
The 4 core dependencies must be first if they are depended on (and in this order: f_type, f_status, f_memory, and f_string).
The dependency generation for individual projects will generate the libraries to link against if a given project has any library source files.
Linking is done in highest level to lowest level to help ensure no linking errors happen.
The dependency generation for level projects and the monolithic project is done based on the library and header sources.
All files specified in the build_sources_library and build_sources_headers are generated.
This script has been run and the dependency updates generated by this script are included in this commit.
Kevin Day [Sun, 1 Sep 2019 07:13:01 +0000 (02:13 -0500)]
Update: implement utf strings, ensure endianness, and add isgraph()/isspace() methods to UTF-8 equivalents
Expand the UTF-8 character type (a 4-byte wide character represented as a big-endian 32-bit integer) into working like f_string and f_dynamic_string.
Provide all similar functionality.
I have decided that the isgraph(), isspace(), etc. functions for UTF-8 should also call the ASCII equivalents.
Update all relating code.
Use memcmp() and memcpy() for comparing the UTF-8 character type (4-byte integer) to the UTF-8 char strings (multiple 1-byte char).
When doing this, make sure to do so with the proper endianness.
Add missing f_utf_character_to_char() function.
Wrap some of the macro parameters in parentheses for safety reasons.
Add f_utf_is_big_endian() and document its use.
Provide custom EOL, EOS, and placeholder defines for UTF characters (4-byte integers).
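A minimal sketch of the endianness concern, assuming the character is stored with its first byte in the most significant position as described above. The names are hypothetical and this is not the f_utf implementation; it only shows why a raw memcmp() against the integer is safe on big-endian hosts and needs a byte-order fix elsewhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host, 0 otherwise. */
int example_is_big_endian(void) {
  const uint32_t probe = 1;
  uint8_t first;

  memcpy(&first, &probe, 1);

  return first == 0;
}

/* Packs up to 4 bytes into a 32-bit value, first byte most significant, unused bytes 0. */
uint32_t example_pack(const uint8_t *bytes, size_t width) {
  uint32_t character = 0;

  for (size_t i = 0; i < width && i < 4; i++) {
    character |= (uint32_t) bytes[i] << (24 - 8 * i);
  }

  return character;
}

/* Compares a packed character against raw bytes, correcting byte order when needed. */
int example_matches(uint32_t character, const uint8_t *bytes, size_t width) {
  uint8_t buffer[4];

  if (width > 4) return 0;

  if (example_is_big_endian()) {
    /* The in-memory layout already matches the byte string. */
    memcpy(buffer, &character, 4);
  }
  else {
    /* Rebuild the byte string order explicitly. */
    for (size_t i = 0; i < 4; i++) {
      buffer[i] = (uint8_t) (character >> (24 - 8 * i));
    }
  }

  return memcmp(buffer, bytes, width) == 0;
}
```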
Kevin Day [Sat, 31 Aug 2019 23:07:16 +0000 (18:07 -0500)]
Update: improve status code handling, remove unnecessary code, and update status programs
The fl_status_is_fine(), fl_status_is_warning(), and fl_status_is_error() functions are no longer needed or valid with the current design around using error and warning bits.
The status conversion code should be more aware of digits.
Get rid of the "context" parameter in the status programs.
Redesign logic in the status programs to work with changes and function correctly (and consistently).
The status programs will eventually need to perform more extensive tests on parameters when digits or non-digits are required.
The current design only provides a very basic test on the first character of a given parameter.
Kevin Day [Sat, 31 Aug 2019 21:19:10 +0000 (16:19 -0500)]
Bugfix: do not include private headers
Manually including private headers from within public headers will result in compilation errors.
The private headers will still get included via the private source file.
Kevin Day [Sat, 31 Aug 2019 20:59:55 +0000 (15:59 -0500)]
Update: handle invalid UTF-8 fragments
A 1-width UTF-8 character (that is not a valid ASCII character) is used to designate part of a complete UTF-8 character block (aka: 1-width UTF-8 characters are fragments).
Because this fragment cannot exist in isolation, it must be handled as either an invalid or an incomplete UTF-8 fragment.
Provide new status codes for handling incomplete UTF-8 fragments.
Update appropriate functions to detect and handle these invalid or incomplete fragments.
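A small sketch of how a single byte can be classified so that fragments are recognized. The enum and function names are hypothetical and do not correspond to the new status codes mentioned above.

```c
#include <stdint.h>

enum example_utf8_kind {
  example_utf8_kind_ascii = 1,
  example_utf8_kind_fragment, /* 10xxxxxx: only valid after a starting byte */
  example_utf8_kind_start,    /* first byte of a 2, 3, or 4 byte sequence */
  example_utf8_kind_invalid   /* never appears in valid UTF-8 */
};

/* Classifies one byte without looking at its neighbors. */
enum example_utf8_kind example_utf8_classify(uint8_t byte) {
  if (byte < 0x80) return example_utf8_kind_ascii;
  if ((byte & 0xC0) == 0x80) return example_utf8_kind_fragment;
  if (byte >= 0xC2 && byte <= 0xF4) return example_utf8_kind_start;

  return example_utf8_kind_invalid;
}
```

A fragment seen where a new character should begin is then either invalid (nothing precedes it) or incomplete (the data ended before the sequence did), which is the distinction the new status codes are meant to carry.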
Kevin Day [Fri, 30 Aug 2019 17:48:54 +0000 (12:48 -0500)]
Update: redesign console processing code
Simplify the console structure and reduce the size of codes and parameters.
The "extra" parameter seems a bit overkill, remove it and help keep this project more along the lines of the idea of "Featureless".
Rewrite and document fl_process_parameters().
Implementing functions were only functionally updated.
Additional changes are likely necessary for the logic, such as supporting multiple parameters like "program -h +n +l", where by order of operation the final "+l" should override the "+n".
Kevin Day [Fri, 30 Aug 2019 01:10:01 +0000 (20:10 -0500)]
Cleanup: f_color, fl_color, and related
In particular, some of the color print functions are not following the naming convention.
The function fl_print_color() should instead be fl_color_print().
Make sure appropriate #define statements have macro in their name.
Add some @fixme comments because f_dynamic_string is designed with the intentions of not being NULL terminated.
Directly using it with standard functions like fprintf is dangerous.
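A sketch of one safe way to format a string that is not NULL terminated, using a precision to bound how many bytes are read. The structure is a hypothetical stand-in for f_dynamic_string and its field names are assumptions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a length-tracked string that is not NULL terminated. */
struct example_dynamic_string {
  const char *string;
  size_t used;
};

/*
 * Formats the string between brackets. The "%.*s" precision stops the read
 * after `used` bytes, so no terminating NULL is needed in the source buffer.
 * Returns what snprintf() returns.
 */
int example_format(char *destination, size_t size, const struct example_dynamic_string *text) {
  return snprintf(destination, size, "[%.*s]", (int) text->used, text->string);
}
```

A plain "%s" would instead keep reading past the end of the buffer until it happened to find a zero byte.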
Kevin Day [Thu, 29 Aug 2019 21:57:34 +0000 (16:57 -0500)]
Cleanup: rename f_errors to f_status, fl_errors to fl_status, and fll_errors to fll_status
Originally f_errors was meant only for error handling but it quickly turned into status code handling (which includes errors).
The naming system of f_errors is now confusing and misleading so change it to f_status.
This makes far more sense, for example:
- f_error_is_error vs f_status_is_error.
- f_error_set_error vs f_status_set_error
Kevin Day [Thu, 29 Aug 2019 19:40:25 +0000 (14:40 -0500)]
Cleanup: rename return_code to status_code and fss_return_code to fss_status_code
The "return codes" were originally intended to be literal return codes.
When the error codes were converted to have error bits, warning bits, and signal bits, this was no longer the case.
Refactor return_code into status_code to be more accurate.
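To illustrate why a status code stops being a literal return code once bits are attached, here is a sketch with an assumed layout: a 16-bit value whose top two bits flag error and warning. The bit positions and all names are assumptions made for illustration, not the project's definitions.

```c
#include <stdint.h>

typedef uint16_t example_status;

/* Assumed layout: the two highest bits are flags, the rest is the code. */
#define example_status_bit_error   ((example_status) 0x8000)
#define example_status_bit_warning ((example_status) 0x4000)
#define example_status_mask_code   ((example_status) 0x3FFF)

#define example_status_set_error(status)   ((example_status) ((status) | example_status_bit_error))
#define example_status_set_warning(status) ((example_status) ((status) | example_status_bit_warning))

#define example_status_is_error(status)   (((status) & example_status_bit_error) != 0)
#define example_status_is_warning(status) (((status) & example_status_bit_warning) != 0)

/* Strips the flag bits to recover the plain code. */
#define example_status_code(status) ((example_status) ((status) & example_status_mask_code))
```

The same numeric code can therefore be reported as fine, as a warning, or as an error, and a caller has to mask the bits before comparing it against a plain code.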
Kevin Day [Fri, 23 Aug 2019 03:50:50 +0000 (22:50 -0500)]
Update: implement UTF-8 support in fss processing code and add additional functionality
Additional functionality includes implementing f_utf_character in f_utf.
Includes numerous other small UTF-8 updates.
Some macros have been wrapped in parentheses to avoid non-obvious issues such as when adding an exclamation before a macro call (and the possible order of operation issues).
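A short sketch of the exclamation problem, using hypothetical macros. Without the outer parentheses, the "!" binds to the first token of the expansion instead of the whole expression. Compilers may warn about the unsafe form, which is the point of the example.

```c
/* Unsafe: expands in place, so a leading "!" only applies to `value`. */
#define example_is_small_unsafe(value) value < 10

/* Safe: the parameter and the whole expression are wrapped. */
#define example_is_small(value) ((value) < 10)

/* Expands to (!value) < 10, which is always 1 because !value is 0 or 1. */
int example_not_small_unsafe(int value) {
  return !example_is_small_unsafe(value);
}

/* Expands to !((value) < 10), which is the intended test. */
int example_not_small(int value) {
  return !example_is_small(value);
}
```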
Kevin Day [Fri, 9 Aug 2019 00:13:21 +0000 (19:13 -0500)]
Feature: support custom 'defines'
Some libraries or packages have custom defines, such as the firewall _en_firewall_debug_.
Provide simple documentation of the defines, in the data/build/defines file.
Provide build settings values for specifying these custom defines.
Provide parameter for manually overriding defines.
Some path settings have been renamed to avoid confusion (such as '--c_path' is now '--path_c').
Kevin Day [Tue, 30 Jul 2019 01:35:02 +0000 (20:35 -0500)]
Update: add build_libraries_fll build setting
Add a new parameter to make it easier to switch between individual compilation, level compilation, and monolithic compilation.
This parameter will only be used for fll-specific libraries.
For programs (aka: "level 3") commented out examples for linking against level-based and monolithic are provided.
The configurations can then be easily swapped out by a couple of simple sed statements, such as:
- sed -i -e 's|^build_libraries_fll\>|#&|g' data/build/settings
- sed -i -e 's|^#build_libraries_fll-level\>|build_libraries_fll|g' data/build/settings
The above example will switch to the level based compiling, while disabling the individual compiling.
The level based compiling compiles each of the levels 0, 1, and 2 as a single library for each level, resulting in libraries such as:
- libfll_0-0.5.0.so
- libfll_1-0.5.0.so
- libfll_2-0.5.0.so
The monolithic based compiling compiles all of the levels 0, 1, and 2 as a single library, resulting in libraries such as:
- libfll-0.5.0.so
The standard names of individual, level, and monolithic do not overlap by default and can therefore be installed side-by-side.
Kevin Day [Mon, 29 Jul 2019 04:01:42 +0000 (23:01 -0500)]
Feature: work directory support
Work Directory provides an easier way for developers to compile and test a particular set of FLL libraries and programs without conflicting with the host system.
If the host system has some version of the FLL project installed, the versions in the work directory will be used instead of the system directories.
Specifying the work directory is done via the '-w' or '--work_directory' parameters.
To better achieve this functionality in the install.sh script, four new additional parameters were created:
- --libraries-static
- --libraries-shared
- --programs-static
- --programs-shared
These provide additional relative or absolute paths for installing the programs and libraries into.
Relative paths given to --libraries-static and --libraries-shared are relative to the library directory (which can be specified via --libdir).
Relative paths given to --programs-static and --programs-shared are relative to the program directory (which can be specified via --bindir).
Kevin Day [Mon, 29 Jul 2019 01:34:08 +0000 (20:34 -0500)]
Cleanup: private firewall files do not need to be #ifdef wrapped
The #ifdef wrappers are intended for custom overrides, which should apply only to functions treated as "public".
These firewall files beginning with private- are private and do not need these wrappings.
Kevin Day [Sun, 28 Jul 2019 22:02:49 +0000 (17:02 -0500)]
Bugfix: install script destination parameters not being respected
I used the wrong name in the grab_next variable when designating to grab the next includedir and libdir.
There is also a mistake where I was copying the destination_prefix onto itself.
Kevin Day [Sun, 28 Jul 2019 21:48:42 +0000 (16:48 -0500)]
Update: always return 1 on failure
There were some cases where exit was not being called and other cases where exit 0 was being called.
Make sure that exit 1 is called on error so that this script can then be scriptable.
Kevin Day [Sat, 27 Jul 2019 21:31:09 +0000 (16:31 -0500)]
Security: set default policy to DROP after deleting chains
Performing numerous syscalls can be slow.
During this time, if the default behavior is open, then unwanted packets may make it through.
By dropping by default, these packets will not go through.