Kevin Day [Fri, 1 May 2020 03:36:45 +0000 (22:36 -0500)]
Update: implement missing serialized functions and fix bugs
Implement fl_unserialize_simple() and fl_unserialize_simple_get().
The last character in the last string in the unserialized array gets cut off.
- This happened because the stopping point was incorrectly compensating for the splitter character for the last character in the string.
The width (UTF-8 character width) is not part of the locations size.
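A minimal sketch of the stopping-point fix described above, not the actual fl_unserialize_simple_get() code. The function name simple_get(), the ':' splitter, and the start/size result are assumptions made for illustration. The point is that the last item ends at the end of the string, so nothing is subtracted for a splitter there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SPLITTER ':' /* assumed splitter character */

/* Hypothetical helper: locate item `index` in a splitter-separated string.
 * On success, writes the item's first offset and its size in bytes. */
bool simple_get(const char *serialized, size_t length, size_t index, size_t *start, size_t *size) {
  assert(serialized && start && size);

  size_t begin = 0;
  size_t current = 0;

  for (size_t i = 0; i <= length; ++i) {
    if (i < length && serialized[i] != SPLITTER) continue;

    /* Here i is either on a splitter or one past the last character.
     * In both cases the item is [begin, i), so the last item keeps its final character. */
    if (current == index) {
      *start = begin;
      *size = i - begin;
      return true;
    }

    begin = i + 1;
    ++current;
  }

  return false;
}
```

With the string "a:bc:def", item 2 starts at offset 5 and has size 3, covering all of "def".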
Kevin Day [Tue, 28 Apr 2020 05:46:11 +0000 (00:46 -0500)]
Feature: implement string mash support
String mashing is a way to append a string to another with a glue string in between.
The idea is that a space could be placed between the two strings.
A string is primarily used as the mash character so that UTF-8 can be natively supported as the glue character.
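A minimal sketch of the mash idea using plain C strings rather than the project's string types. The helper name mash() is hypothetical, and skipping the glue when the destination is empty is an assumption, not something this commit states.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the FLL API: return a new string holding
 * destination + glue + source. The glue is a string, so it may be a
 * multi-byte UTF-8 character. The caller frees the result. */
char *mash(const char *destination, const char *glue, const char *source) {
  assert(destination && glue && source);

  const size_t length_destination = strlen(destination);
  /* Assumption: no glue is needed when there is nothing to glue onto. */
  const size_t length_glue = length_destination ? strlen(glue) : 0;
  const size_t length_source = strlen(source);

  char *result = malloc(length_destination + length_glue + length_source + 1);
  if (!result) return NULL;

  memcpy(result, destination, length_destination);
  memcpy(result + length_destination, glue, length_glue);
  /* Copy the source along with its terminating NUL. */
  memcpy(result + length_destination + length_glue, source, length_source + 1);

  return result;
}
```

For example, mash("hello", " ", "world") yields "hello world".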
Kevin Day [Mon, 27 Apr 2020 02:13:28 +0000 (21:13 -0500)]
Update: short variable name, reorganize function, fix types
The use of parameter_counter ended up being very wordy with "parameters.parameter[parameter_counter]".
- Change this to 'i', since that is unused and traditional.
Reorganize the additional parameter processing loop.
Use uint8_t instead of "unsigned short" for more consistency.
The width_max should be f_number_unsigned and not "unsigned short".
Add additional documentation for console ids in regards to what "empty" means.
Kevin Day [Sun, 26 Apr 2020 07:14:39 +0000 (02:14 -0500)]
Bugfix: remove null pointer check from file stat
Do not assume that stat was passed as an uninitialized pointer.
In the case of passing a class by reference, the stat pointer would be non-zero.
This pointer is not dynamically allocated and therefore not an error.
The null pointer check is therefore invalid for these cases.
Kevin Day [Sat, 25 Apr 2020 17:46:36 +0000 (12:46 -0500)]
Update: redesign byte_dump --last to be inclusive
The --last parameter should be inclusive for consistency with the rest of the project, namely FSS.
By redesigning the --last to internally be represented as a relative offset, the behavior can be simplified.
Doing this also adds support for potential future designs where a --length parameter may be provided that is a relative size parameter instead of an absolute one like --last.
Kevin Day [Sat, 25 Apr 2020 01:54:05 +0000 (20:54 -0500)]
Update: add initial handling of zero-width space in FSS projects
UTF-8 zero-width characters have the potential for being combining characters.
In such cases, the combiner/joiner should be considered part of what is being combined.
That is, a combiner/joiner before a graph should be treated as a graph.
A combiner/joiner before a whitespace should be treated as a whitespace.
Disclaimer: I suspect that this will eventually need to be broken down to handle each specific case.
A combiner/joiner on whitespace that results in rendering a printable/visible character would be a violation of the whitespace design principles of FSS.
Further investigation is needed and will likely require changes.
Add appropriate @todo for further development of this functionality.
Rename max_width to width_max to follow newer practices.
Kevin Day [Sat, 25 Apr 2020 00:39:35 +0000 (19:39 -0500)]
Feature: add f_string_length_size max of (uint64_t - 4)
The goal here is that any string processing must be able to add a given UTF-8 width (up to 4 bytes) and not overflow.
Much of the code in this project will not check this, as the check should be done at a higher level (for performance reasons).
Ideally, the check happens when allocating a string: never allocate more than f_string_length_size.
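A minimal sketch of why the cap makes width additions safe. The macro and function names here are stand-ins for illustration, not the project's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-ins; the real values are defined in the project headers. */
#define UTF_WIDTH_MAX 4
#define STRING_LENGTH_SIZE (UINT64_MAX - UTF_WIDTH_MAX)

/* True when `length` respects the cap; meant to be checked once, at allocation. */
bool length_is_allowed(uint64_t length) {
  return length <= STRING_LENGTH_SIZE;
}

/* Since length <= UINT64_MAX - 4, adding a width of at most 4 cannot wrap around,
 * so lower-level code can add a character width without its own overflow check. */
uint64_t length_after_character(uint64_t length, uint8_t width) {
  assert(length_is_allowed(length) && width <= UTF_WIDTH_MAX);
  return length + width;
}
```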
Kevin Day [Fri, 24 Apr 2020 02:43:01 +0000 (21:43 -0500)]
Bugfix: fix UTF-8 whitespace detection and provide zero-width detection function
The whitespace detection codes for UTF-8 were incorrect.
Non-printing characters, called zero-width, are not whitespace.
Move them out of the whitespace detection and provide a new function for detecting zero-width.
Handle additional UTF-8 whitespace character codes that I had previously missed.
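A minimal sketch of keeping the two classes separate. It works on decoded Unicode code points rather than on the project's UTF-8 character type, the function names are hypothetical, and the zero-width list is only a small subset chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A few zero-width code points: U+200B, U+200C, U+200D, U+2060, U+FEFF. */
bool is_zero_width(uint32_t c) {
  return c == 0x200B || c == 0x200C || c == 0x200D || c == 0x2060 || c == 0xFEFF;
}

/* Code points carrying the Unicode White_Space property. */
bool is_whitespace(uint32_t c) {
  const bool result =
    c == 0x20 || (c >= 0x09 && c <= 0x0D) ||
    c == 0x85 || c == 0xA0 || c == 0x1680 ||
    (c >= 0x2000 && c <= 0x200A) ||
    c == 0x2028 || c == 0x2029 || c == 0x202F || c == 0x205F || c == 0x3000;

  /* The two classes must never overlap. */
  assert(!(result && is_zero_width(c)));

  return result;
}
```

Note that U+200A (hair space) is whitespace while its neighbor U+200B (zero-width space) is not.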
Kevin Day [Thu, 23 Apr 2020 01:40:50 +0000 (20:40 -0500)]
Feature: implement support for the -T/--trim parameter
Provide support for trimming the object names on input and output.
After implementing this, I remembered that the standard might require that whitespace before and after a valid object name be ignored.
This may be removed in the future and fixed in the library.
Additional investigation on how I want to handle this needs to happen first.
The standard was originally designed around ASCII, in which only ASCII whitespace is considered whitespace.
This will probably have to be fixed to match the additional goals of the project in terms of whitespace handling.
Kevin Day [Wed, 22 Apr 2020 02:00:18 +0000 (21:00 -0500)]
Feature: expand string functions and utf-8 string functions
The need for this was realized while developing this trim parameter feature.
This has been added as is for commit isolation and may be incomplete.
This utilizes private functions to reduce duplicate code.
While the use of private functions is generally unwanted in this project, this specific case seems to be an exception.
Add rip functions.
Add non-dynamic equivalent of some string functions.
Add trim functions.
Kevin Day [Tue, 25 Feb 2020 03:28:38 +0000 (21:28 -0600)]
Progress: continue development of FSS Extended List
In particular:
- Remove excessive fl_fss_increment_buffer() uses.
- The removed code may be a good idea long term, but for now use a simpler and more efficient approach.
- Fix some mistakes in the slash delimiter handling.
- Begin the initial work for recursively (or so..) processing the nested lists.
Kevin Day [Sat, 23 Nov 2019 04:56:20 +0000 (22:56 -0600)]
Update: build level and monolithic fixes and improvements
Add missing library: fll_file.
Make it even easier to compile against "level" and "monolithic" build processes by providing "--level" and "--monolithic" parameters to the generate.sh script and associated settings files.
Make sure package.sh clears other build modes (when --level is specified, make sure --individual and --monolithic modes are not set).
Kevin Day [Fri, 22 Nov 2019 02:24:43 +0000 (20:24 -0600)]
Update: implement *_delete_simple() and *_destroy_simple() macros and related changes
I do not want a *_delete() and *_destroy() that doesn't provide the ability to handle a status code.
However, in practice, the *_delete() and *_destroy() macros rarely needed the status response checked.
Additional f_status variables were created only to be provided for the status parameter of the stated macros.
This is a waste.
This provides and utilizes alternative *_delete() and *_destroy() macros called *_delete_simple() and *_destroy_simple().
These simple macros do not accept or process the status code.
The resulting code is simpler and easier.
I am on the fence whether or not to throw away the *_delete() and *_destroy() macros that utilize a status parameter.
Future versions may or may not replace the *_delete() and *_destroy() with the *_delete_simple() and *_destroy_simple() macros.
Update *_delete() and *_destroy() macros as needed.
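A minimal sketch of the two macro styles, using stand-in types. string_t, status_t, and the macro names below are hypothetical and only mirror the naming pattern; they are not the project's macros.

```c
#include <assert.h>
#include <stdlib.h>

typedef int status_t; /* stand-in for f_status */

typedef struct {
  char *string;
  size_t size;
  size_t used;
} string_t; /* stand-in for a dynamic string type */

/* Status-reporting form: the caller must supply a status variable. */
#define string_delete(status, s) \
  do { free((s).string); (s).string = NULL; (s).size = 0; (s).used = 0; (status) = 0; } while (0)

/* Simple form: same cleanup, no status parameter. */
#define string_delete_simple(s) \
  do { free((s).string); (s).string = NULL; (s).size = 0; (s).used = 0; } while (0)

/* Usage: cleanup without declaring a throwaway status variable. */
void example(void) {
  string_t name = { malloc(16), 16, 0 };

  string_delete_simple(name);
  assert(name.string == NULL && name.size == 0);
}
```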
Kevin Day [Wed, 20 Nov 2019 01:43:54 +0000 (19:43 -0600)]
Refactor: make status codes more consistent
Better follow the naming paradigm where reasonably possible.
Use more consistent naming, which should also help with reducing the chances of enum to function name conflicts.
Alphabetically organize status codes per group.
Kevin Day [Wed, 20 Nov 2019 01:00:00 +0000 (19:00 -0600)]
Update: rewrite status code functions and related changes
Fix several problems with the status code processing, namely missing or incomplete data.
Add status code max size defines.
Add missing "#ifndef _di_f_status_codes_" to status codes.
Update cases where 'error' was not renamed to 'code' in status code sources.
Add missing status codes.
Fix incorrect define, "#ifdef _di_fl_status_invalid_" should instead be "#ifndef _di_fl_status_invalid_".
Fix incorrect sizes associated with status code output strings.
Prepare pipe support (not yet implemented).
Ensure f_utf is included and linked in all appropriate dependencies.
Treat f_utf as a core level_0 project and update documentation accordingly.
Relocate fl_console_parameter_process() into f_console_parameter_process().
Kevin Day [Sun, 17 Nov 2019 06:58:40 +0000 (00:58 -0600)]
Bugfix: When --last is set to 0, entire file is dumped
The --last value being set to 0 is internally used to represent entire file.
Explicitly setting --last to 0 makes no sense, so set the minimum allowed size for --last to 1.
Kevin Day [Sat, 16 Nov 2019 20:50:58 +0000 (14:50 -0600)]
Update: --name and --at in combination should process '--at' relative to '--name'
This logic is already done with --line and other parameters.
Doing this same thing with --name and --at makes the code/logic more consistent and reasonable.
Kevin Day [Thu, 14 Nov 2019 02:54:31 +0000 (20:54 -0600)]
Feature: add support for including empty content in fss_basic_read
Empty content is an object that has no content.
When there is no content for an object, no content is printed for that line and that line is not included in content totals or line selections.
Kevin Day [Wed, 13 Nov 2019 06:12:42 +0000 (00:12 -0600)]
Progress: continue implementing fss_basic_read, also numerous other fixes/tweaks
I decided to allow --at and --name to be used at the same time (and therefore at the same --depth).
The depth code is to be rewritten; so far it is only partially rewritten.
Many of the parameters are now written and the fss_basic_read needs to be tested and reviewed.
(The fss_basic_read is still incomplete, but there is enough working code to begin testing.)
Kevin Day [Sun, 10 Nov 2019 04:31:20 +0000 (22:31 -0600)]
Feature: add support for duodecimal (base-12)
Now that duodecimal has been added to the FLL project, make sure byte_dump can print in that format.
There is no printf() code for base-12, so implement a custom print process.
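A minimal sketch of a custom base-12 conversion, not the byte_dump implementation. The function name is hypothetical, and using 'a' and 'b' for the digits ten and eleven is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write `value` in base 12 into `buffer` as a
 * NUL-terminated string and return the number of digits written.
 * `size` is the capacity of `buffer`, including room for the NUL. */
size_t print_duodecimal(uint64_t value, char *buffer, size_t size) {
  static const char digits[] = "0123456789ab";
  char reversed[18]; /* a 64-bit value needs at most 18 base-12 digits */
  size_t count = 0;

  /* Collect digits from least to most significant. */
  do {
    reversed[count++] = digits[value % 12];
    value /= 12;
  } while (value);

  assert(buffer && size > count);

  /* Emit them most significant first. */
  for (size_t i = 0; i < count; ++i) {
    buffer[i] = reversed[count - 1 - i];
  }

  buffer[count] = '\0';

  return count;
}
```

A single byte needs up to three digits in this base: 255 is written as "193" (1 * 144 + 9 * 12 + 3).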
Kevin Day [Sun, 10 Nov 2019 04:28:28 +0000 (22:28 -0600)]
Update: implement f_number_signed and f_number_unsigned, as either 32-bit, 64-bit, or 128-bit
Provide the types f_number_signed and f_number_unsigned as a way to define the default "number" type to be used for string to number conversions and array indexes.
By providing 32-bit, 64-bit (default), and 128-bit types, the type can then be adjusted to more easily work on limited hardware or expand to more capable hardware.
This will be the recommended number data type to use in FLL functions going forward.
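A minimal sketch of how such a compile-time selection could look. The NUMBER_USE_* macros and the type names are hypothetical, and the 128-bit branch relies on the __int128 extension of GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical selection macros; 64-bit is the default when none is defined. */
#if defined(NUMBER_USE_32_BIT)
  typedef int32_t number_signed;
  typedef uint32_t number_unsigned;
#elif defined(NUMBER_USE_128_BIT)
  typedef __int128 number_signed;
  typedef unsigned __int128 number_unsigned;
#else
  typedef int64_t number_signed;
  typedef uint64_t number_unsigned;
#endif

/* Width in bits of the selected number type. */
unsigned number_bits(void) {
  assert(sizeof(number_signed) == sizeof(number_unsigned));

  return (unsigned) (sizeof(number_unsigned) * 8);
}
```

Code that declares array indexes and conversion results with these types can then be rebuilt for smaller or larger hardware without touching each call site.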
Kevin Day [Sat, 9 Nov 2019 01:25:07 +0000 (19:25 -0600)]
Progress: rewriting fss_* programs and all dependencies
I am changing the parameters and design of the fss_* programs, such as fss_basic_read.
There is no work done on the fss_*_write programs yet.
There were a lot of changes in the dependencies, including cleanups and improvements.
The parameters passed to the fss_* functions will now be more consistent across each of them.
This should make scripting much easier.
There is a lot of incomplete work and I am focused currently on getting fss_basic_read to work as desired.
I halted my work on f_conversion and fl_console to make sure none of these changes are lost.
I do not expect this commit to compile with everything due to the incomplete work.
I would rather post incomplete code than risk losing code as has happened in the past.
Kevin Day [Fri, 1 Nov 2019 04:07:31 +0000 (23:07 -0500)]
Progress: continue working on fss-003 Extended List
A nested type has been created.
I suspect that I will need to change the structure of the other types to improve consistency, but more review and consideration is needed before any such changes are made.
A read program was written but it is essentially a copy and paste of Basic List, with a few minor changes just to make it compile.
The program arguments of all the FLL programs will need to be changed so that "depth" selection can be supported for things like Extended List.
The memory allocation is implemented but not reviewed.
I converted the behavior to support nesting, but I need to review the logic to ensure I caught everything.
Kevin Day [Wed, 18 Sep 2019 00:09:44 +0000 (19:09 -0500)]
Update: finish implementing f_utf_character_is_valid() and related UTF-8 changes
UTF-8 BOM is actually not a thing but only a suggestion, see RFC 3629.
I consider it a very bad practice now that I have learned that it is also the zero-width no-break space (U+FEFF).
Get rid of the UTF-8 BOM support, it is a bad idea and is not to be supported by this project.
The referenced RFC also provides an easier way to view the valid ranges than my previous resources (such as Wikipedia).
This helped me finish this function.
Updated byte_dump to better utilize this and to remove no longer necessary code.
Fix an accidental incorrect "invalid detection" check use before calling f_utf_character_is_valid() in byte_dump.
Explicitly print a "." or " " for UTF-8 control characters (ASCII control characters are already handled before this point so it is safe to call f_utf_character_is_control()).
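A minimal sketch of checking one UTF-8 sequence against the byte ranges in the RFC 3629 syntax (section 4). This is not f_utf_character_is_valid(). The function name, and passing the width separately, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* UTF8-tail = %x80-BF */
static bool is_tail(uint8_t b) {
  return b >= 0x80 && b <= 0xBF;
}

/* Hypothetical helper: validate a single sequence of `width` bytes. */
bool utf8_sequence_is_valid(const uint8_t *s, size_t width) {
  assert(s);

  if (width == 1) return s[0] <= 0x7F;

  /* 0xC0 and 0xC1 are excluded: they could only encode overlong forms. */
  if (width == 2) return s[0] >= 0xC2 && s[0] <= 0xDF && is_tail(s[1]);

  if (width == 3) {
    if (!is_tail(s[2])) return false;
    if (s[0] == 0xE0) return s[1] >= 0xA0 && s[1] <= 0xBF; /* no overlong forms */
    if (s[0] == 0xED) return s[1] >= 0x80 && s[1] <= 0x9F; /* no surrogates */
    return s[0] >= 0xE1 && s[0] <= 0xEF && is_tail(s[1]);
  }

  if (width == 4) {
    if (!is_tail(s[2]) || !is_tail(s[3])) return false;
    if (s[0] == 0xF0) return s[1] >= 0x90 && s[1] <= 0xBF; /* no overlong forms */
    if (s[0] == 0xF4) return s[1] >= 0x80 && s[1] <= 0x8F; /* nothing above U+10FFFF */
    return s[0] >= 0xF1 && s[0] <= 0xF3 && is_tail(s[1]);
  }

  return false;
}
```

The bytes EF BB BF pass this check because they are a well-formed sequence. Whether to honor them as a BOM is a separate policy decision.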