NuFX, short for "NuFile eXchange", is a file format developed by Andy Nicholas for archiving files and disks on the Apple II series of computers. The format was devised in tandem with the development of ShrinkIt, which became the standard archive software for the Apple II soon after its release in 1989. NuFX archives usually have filenames that end in ".SHK".
This document describes the API (Application Program Interface) for NufxLib, a library of functions that manipulate NuFX archives.
Good engineering practices dictate that an API should be minimal and complete. The confusion generated by redundant and overlapping interfaces can be as harmful as an omitted vital feature. I feel pretty good about the "complete" part, since NufxLib provides a way to do pretty much everything that I can reasonably expect somebody to want to do, but in some cases "minimal" has been swept aside in the name of convenience. (See the Design Notes section for additional commentary on this topic.)
The NuFX specification is extremely general, and does not explicitly allow or forbid unusual conditions like having a record with two filenames in it. NufxLib follows the NuFX specification on everything that is spelled out, but restricts some of the undefined behaviors to the subset defined in the NuFX Addendum.
In this document, the term "threads" usually refers to NuFX threads -- structures in the archive -- not CPU threads.
That explains what I set out to do. Here's a quick summary of what I accomplished.
The library is protected by copyright, but can be distributed under the terms of the BSD License. See the file "COPYING-LIB" for full details.
Some changes were made during the development of NufxLib that broke binary compatibility with version 1.1 of the library. The changes were:
In addition, a NuTestRecord call was added.
Applications written against v1.x may need to be updated. Check the NufxLib "samples" directory for examples of programs that use the updated calls.
To make version management easier, v2.x includes the version number in the NufxLib.h header file. This allows dynamically-linked applications to compare a "compiled" version against a "linked" version.
This was a major source code cleanup effort, one aspect of which was switching from general C types ("unsigned long") to types with explicit sizes ("uint32_t"). In some cases this caused some compilers to report errors, even though there's a fair chance that binary compatibility wasn't affected. Since it was an API-breaking change at some level, the major version number was bumped.
The other major API change was the separation of Mac OS Roman and Unicode strings, which were previously blended freely.
This document assumes that you are already familiar with the NuFX archive format, as described in the Apple II File Type Note for $e0/8002 and the Winter 1990 issue of Call-A.P.P.L.E. For those unwilling to wade through the technical documentation, here is a quick overview.
A NuFX archive is composed of a Master Header followed by a series of Records. Each Record is composed of a Record Header and one or more Threads. The general idea is to store one file per Record.
Each Thread holds a blob of data. The data can be a data or resource fork of a file, a disk image, a comment, or the filename for the Record. The Threads are identified by a "class" and a "kind". The "class" tells you if it's a data thread, comment, filename, or something else, and the "kind" refines the class. For example, the resource fork of a file is a data-class thread with a "kind" of 2.
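As a small illustration of the class/kind pairing, the sketch below packs the two 16-bit values into a single 32-bit identifier, which is how the NuThreadID type described later in this document combines them. The function names are hypothetical and are not part of NufxLib; the data class value of 2 is inferred from the kNuThreadIDDataFork constant (0x00020000).

```c
#include <stdint.h>

/* Hypothetical helpers, not NufxLib API: pack and unpack a thread
 * class and kind the way a 32-bit thread ID combines them. */
uint32_t make_thread_id(uint16_t threadClass, uint16_t threadKind)
{
    /* class goes in the high 16 bits, kind in the low 16 bits */
    return ((uint32_t)threadClass << 16) | threadKind;
}

uint16_t thread_id_class(uint32_t threadID)
{
    return (uint16_t)(threadID >> 16);
}

uint16_t thread_id_kind(uint32_t threadID)
{
    return (uint16_t)(threadID & 0xffff);
}
```

With these helpers, a data-class thread of kind 0 comes out as 0x00020000, and the resource fork from the example above (data class, kind 2) comes out as 0x00020002.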
Some Threads, notably filenames and comments, are pre-sized, meaning that the space allocated for them in the archive is larger than what is actually used. Filenames usually have at least 32 bytes set aside for them, though in practice a simple ProDOS filename will be shorter. This makes it possible to rename files and update comments without having to reconstruct the archive.
The archive Master Header has only a few bits of information, such as the number of records and the date the archive was created. Unlike a ZIP archive, NuFX has no central table of contents. If you want to display the contents of an archive, you have to read the first Record header, pull the filename out (usually by finding and reading a filename Thread), compute the total size of the Record, and seek forward past the data. Repeat the process with each subsequent record, until you reach the end of the archive.
The predominant compression algorithm is a slightly modified LZW (Lempel-Ziv-Welch). It's fast, but not very effective compared to the standard methods used in modern archivers.
There are five basic categories of API calls.
ReadOnly calls do not modify the archive in any way. The operations include things like listing and extracting files. These can be used on archive files opened read-only or read-write.
StreamingReadOnly calls are a subset of ReadOnly calls that can be made on a streaming archive. A "streaming" archive is one that cannot be seeked, e.g. an archive being received over a network socket or a pipe from stdin. The same functions are invoked as for ReadOnly archives, but in some rare cases the results may be different.
ReadWrite calls change the archive contents. Functions that add and delete files are here. These can only be used on archive files opened read-write.
General calls can be made regardless of how the archive was opened. Functions included here can get and set archive parameters and define callbacks.
Helper functions don't do anything to the archive. They're functions or macros that do useful things with some of the data types returned. They're included as a convenience.
The library does everything it can to aid multi-threading. You should be able to perform simultaneous operations on multiple archives (assuming you have the reentrant versions of certain libc calls available). You cannot, however, invoke multiple simultaneous operations on a single archive.
There is a general philosophy of laziness employed. For example, opening an archive does not cause the entire table of contents to be read. (In a NuFX archive, that would require scanning through most of the file.) As a result, there are actually three different ways to get the table of contents out of an archive:
For write operations, a certain form of laziness is again employed. If you want to delete three records from various points in an archive, you don't want to have to update the archive three times. NufxLib handles this by deferring all write operations until a "flush" call is made. In most cases, a "flush" results in a new archive being constructed in a temp file, which is subsequently renamed over the original. The flush call does not close the archive, so it is possible to do things like:
In certain restricted cases, such as updating a comment or appending new records, the original archive can (optionally) be updated in place, saving a potentially lengthy copying of data.
As a final example of laziness, NufxLib does not re-read the archive it has just written after a Flush. It would have been easier to write all changes, throw out all data structures, and re-read the archive from scratch, but that could be slow. Instead, the library keeps track of the changes it has made -- something that gets a little tricky when filename threads are updated. Being lazy is often more work.
Filenames stored in archives use the Mac OS Roman character set. The low 128 characters are ASCII; the high 128 are specified by the Mac OS Roman encoding table. NufxLib will convert between Mac OS Roman and Unicode when necessary, and provides conversion functions for application use.
When specifying a "local filename", i.e. a file on Linux or Windows, the API expects a Unicode string. When referring to an archived file by name (the "storage name"), the API uses the Mac OS Roman form. The parameter and field names reflect the character set ("UNI" or "MOR"), and use the UNICHAR type for Unicode strings. On Linux and Mac OS X the filename is encoded with UTF-8. On Windows it should be encoded with UTF-16, but that hasn't been implemented yet, so the API still uses 8-bit characters and effectively treats MOR strings as if they were Windows Code Page 1252. (This means the behavior of NufxLib is essentially unchanged for 3.0 on Windows.)
All API calls and data types begin with "Nu", and all constants start with "kNu". All internal functions start with "Nu_", and any internal data tables with global scope start with "gNu". Hopefully these rules will avoid compile-time and link-time name conflicts.
For details about the fields available in different structures, see the NufxLib.h header file. Everything in NufxLib.h is public. Most of these types have a direct analog with a field or structure in the NuFX specification.
UNICHAR (char -or- wchar_t): All filenames for "local" files, i.e. files on the Linux or Windows filesystem, should use UNICHAR. This will be char on Linux and Mac OS X. Windows uses UTF-16 encoding, so wchar_t will be required there, and someday UNICHAR will be wchar_t for Win32. For now it's an 8-bit char there as well, because Unicode filename handling for Windows is incomplete and the code does not currently use wide chars.
NuError (enum): Most library functions return NuError. A value of zero (kNuErrNone) indicates success, anything else indicates failure. Values less than zero are NufxLib errors, while values greater than zero are system errors (like ENOENT).
NuResult (enum): Callback functions return these values to tell NufxLib how things went. For example, an error callback can tell the library to Abort, Retry, or Skip. (Okay, it can Ignore too.)
NuRecordIdx and NuThreadIdx (uint32_t): These are used to identify a specific record or thread in API calls. Their values are assigned when the archive file is read. They aren't reused, so if you delete some records and add some new ones, the indices of the deleted records won't appear again. Do not assume that the indices start at a specific value or are assigned in a particular order. The indices are assigned when the archive is opened, and if you close and reopen the archive, they may be completely different.
NuThreadID (uint32_t): This is a combination of the 16-bit "thread class" and the 16-bit "thread kind". Constants are defined for common values, e.g. kNuThreadIDDataFork (0x00020000) indicates a data fork.
NuThreadFormat (enum): An enumeration of constants representing the 16-bit "thread format" value. This is used to specify a type of compression (uncompressed, LZW/1, LZW/2, etc).
NuFileSysID (enum): An enumeration of GS/OS file system identifiers.
NuStorageType (enum): An enumeration of ProDOS storage types. There are extended (forked) files, directories, and three types of plain files.
NuArchive (opaque struct): This is the fundamental state structure for all API calls. Every call takes one of these as an argument. The structure contains all of the information about the archive and pending operations.
NuCallback (pointer to function): Callback function declarations must match this type. An example would be "NuResult MyFunction(NuArchive* pArchive, void* args)".
NuValueID (enum): An identifier for settable values. You can change certain NufxLib parameters after opening an archive. This enum is how you specify which parameter you want to change.
NuValue (uint32_t): The new value for the parameter specified by the NuValueID.
NuAttrID (enum): An identifier for archive attributes. You can get information about archive attributes (characteristics of the archive itself) through a NufxLib interface. This type has an enumeration of the legal values.
NuAttr (uint32_t): The value for the attribute specified by the NuAttrID is placed in one of these.
NuDataSource (opaque struct): Some of the fancier NufxLib calls allow you to use data from a file on disk, a file that's already open, or a buffer of memory. This struct contains that specification.
NuDataSink (opaque struct): Like NuDataSource, this specifies a data location. This struct is for data being extracted.
NuDateTime (struct): This holds the date and time in an expanded format, using the same structure as TimeRec from "misctool.h" on the IIgs.
NuThread (struct): The fields from the thread header, as well as a few new ones like the absolute file offset, are accessible.
NuRecord (struct): This has all of the fields from the NuFX Record structure, as well as some convenience fields (like "filename", which always points to the right filename whether it was stored in the record header or came out of a thread). Some calls cause a NuRecord structure to be passed to a callback function, where it can be accessed directly. The Threads are represented as an array of NuThread structures attached to the NuRecord.
NuMasterHeader (struct): This holds the data from the archive's master header block.
NuRecordAttr (struct): Some of the fields in a NuRecord can be changed, such as the file type and modification date. This structure contains the modifiable fields, and is used as an argument to two of the API calls.
NuFileDetails (struct): When adding files, it is up to the application to supply many of the details about the file, such as the file type, access permissions, and modification date. This structure provides a way to pass those values into the library.
NuSelectionProposal (struct): Selection callback functions receive one of these.
NuPathnameProposal (struct): Pathname filter callback functions receive one of these.
NuProgressData (struct): Progress update callback functions receive one of these.
NuProgressState (enum): A component of NuProgressData, this tells the callback function what the library is doing.
NuErrorStatus (struct): Error handling callback functions receive one of these.
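The sign convention given for NuError above (zero is success, negative values are NufxLib errors, positive values are system errors) can be sketched as a small classifier. This is only an illustration: the enum and function names are hypothetical, and a plain int stands in for NuError so the sketch compiles without NufxLib.h.

```c
#include <errno.h>

/* Hypothetical classification, not part of NufxLib. */
enum err_origin { ERR_NONE, ERR_LIBRARY, ERR_SYSTEM };

/* "err" stands in for a NuError value returned by a library call. */
enum err_origin classify_error(int err)
{
    if (err == 0)
        return ERR_NONE;        /* corresponds to kNuErrNone */
    /* negative: NufxLib error; positive: errno-style system error */
    return (err < 0) ? ERR_LIBRARY : ERR_SYSTEM;
}
```

An application might use something like this to decide whether to report a failure with strerror() or with a NufxLib-specific message.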
Files are referenced with standard libc FILE* pointers. The library uses fseek and ftell, which are defined by POSIX to take a signed long integer for the offset argument, so archives larger than 2GB cannot be handled.
These interfaces can be used on read-only and read-write archives. A subset, described later, can also be used on streaming-read-only archives.
Creates a new NuArchive structure for the "archivePathname" file. The file will be opened in read-only mode.
Attempting to use ReadWrite interfaces on a read-only archive will fail.
Read the list of entries from the archive. If the full table of contents has already been read, the in-memory copy will be used.
"contentFunc" is a callback function that will be called once for every record in the archive. The callback function should look something like this:
NuResult EntryListing(NuArchive* pArchive, const NuRecord* pRecord)
(Depending on your compiler, you may have to declare "pRecord" as a void* and cast it in the function.)
The record passed to the callback function does not reflect the results of any un-flushed changes. Additions, deletions, and updates will not be visible until NuFlush is called.
The application must not attempt to retain a copy of "pRecord" after the callback returns, as the structure may be freed. Anything of interest should be copied out.
Try to extract all files from the archive. Each entry is passed through the SelectionFilter callback, if one has been supplied, to determine whether or not it should be extracted. The OutputPathnameFilter callback is invoked to convert the filenames to something appropriate for the target filesystem.
On systems that support forked files, a record with both data and resource forks can be extracted to the individual forks of the same file. On systems without native support for forks, the data can be extracted into two different files by using the OutputPathnameFilter. If the system doesn't support forks, and no OutputPathnameFilter is specified, then the forks will be extracted into the same file. Depending on the value of the kNuValueHandleExisting parameter, this could result in one fork overwriting the other, in one fork not getting extracted, or in the HandleError callback getting invoked. (The HandleError callback can choose to rename the file, overwrite it, skip the current entry, or abort the entire process.)
The global EOL conversion setting is applied to all threads, but is automatically turned off for disk image threads.
Extract a single record. Otherwise identical to NuExtract. The SelectionFilter callback, if specified, will be invoked.
There are a number of ways to get the recordIdx. You can call NuContents and use the callback to find the one you want. You can get the recordIdx by the filename stored in the archive, with NuGetRecordIdxByName. Or, you can get it by the record's offset in the archive, using NuGetRecordIdxByPosition.
Extract a single thread. Specify the thread index and a place to put the data. The SelectionFilter callback, if specified, will be invoked.
Remember that, if EOL conversion is enabled in the data sink, the amount of data that comes out of a thread may not match pThread's "actualThreadEOF" value.
(In some ways it doesn't really make sense to call the SelectionFilter callback when a specific thread has been singled out for extraction. However, it's easy to disable (set the callback to NULL), it may prove useful, and it keeps the interface consistent.)
The NuTest call is functionally equivalent to NuExtract in every way but one: it doesn't actually extract anything. If you want to test a subset of the files, supply a SelectionFilter callback.
This won't test filenames or comments because those aren't extracted by NuExtract. However, since such threads don't have CRCs, there's really nothing to test anyway. The parts that can be tested for correctness are verified automatically when the archive table of contents is read.
A single-record version of NuTest.
Get a pointer to the record header. The thread array can be accessed through this pointer.
As with callbacks, when you get a const pointer, it is very important that you don't try to modify it. The structure pointed to is part of the current archive state, so the effects of changes are unpredictable. If you wish to alter fields in the Record header, use the NuSetRecordAttr call.
IMPORTANT: you must discard this pointer if you call NuFlush or NuClose.
Get the recordIdx for the first record in the archive whose case-insensitive filename matches "name". The value retrieved can be used with any call that takes a NuRecordIdx argument.
The "name" string must match the record's filename exactly, including the filename separator character.
If you know what you want to extract from an archive by name, use this.
Get the recordIdx for the nth record in the archive. "position" is zero-based, meaning the very first record in the archive is at position 0, the next is at position 1, and so on. The value retrieved can be used with any call that takes a NuRecordIdx argument.
This could be useful when an application is certain that it is only interested in the very first record in the archive, e.g. an Apple II emulator opening a disk image.
A streaming archive is presented to the library as a FILE* that can't be seeked, generally because it was handed to the application via a pipe or shell redirect. A subset of the ReadOnly interfaces is supported. All of them leave the stream pointed at the first byte past the end of the archive.
These calls are also useful for files on disk in situations where memory is at a premium. Because it's impossible to seek backwards in the archive, no attempt is made to remember anything about records other than the one most recently read.
The interfaces supported are:
There is one interface that only applies to StreamingReadOnly archives:
Creates a new NuArchive structure for "infp". The file must be positioned at the start of the archive.
It should be possible to concatenate multiple archives together, and use them by issuing consecutive NuStreamOpenRO calls.
If your system requires fopen(filename, "rb") instead of "r" (e.g. Win32), make sure the archive file was opened with "b", or you may get "unexpected EOF" complaints.
Open a file for read-write operations. A pointer to the new archive is returned via "ppArchive".
"archivePathnameUNI" is the name of the archive to open. If the file has zero length, the archive will be treated as if NufxLib had just created it.
"tempPathnameUNI" is the name of the temp file to use. The call will fail if the temp file already exists. The temp file must be in a location that allows it to be renamed over the original archive when a "flush" operation has completed. If "tempPathnameUNI" ends in six 'X's, e.g. "tmpXXXXXX", the name will be treated as a mktemp-style pattern, and a unique six-character string will be substituted before the file is opened. Note that the temp file will be opened even if "kNuValueModifyOrig" is set.
"flags" is a bit vector of boolean flags that affect how the archive is opened. If no flags are set, and the archive doesn't exist, the call will fail. If "kNuOpenCreat" is set, the archive will be created if it doesn't exist. If "kNuOpenCreat" and "kNuOpenExcl" are both set, the call will fail if "archivePathnameUNI" already exists (i.e. the archive *must* be created).
If the archive was just created, "kNuValueModifyOrig" will be set to "true".
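The flag rules described above for "flags" can be summarized as a small decision function. This is a sketch under stated assumptions, not the library's code: the SKETCH_ bit values are placeholders (the real kNuOpenCreat and kNuOpenExcl values live in NufxLib.h), the names are hypothetical, and the text does not say what the exclusive flag does on its own, so the sketch assumes it has no effect without the create flag.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration only. */
enum { SKETCH_OPEN_CREAT = 0x01, SKETCH_OPEN_EXCL = 0x02 };

enum open_outcome { OPEN_EXISTING, OPEN_CREATE_NEW, OPEN_FAIL };

/* Decide what an open call should do, given the flags and whether
 * the archive file already exists. */
enum open_outcome decide_open(unsigned flags, bool archiveExists)
{
    bool creat = (flags & SKETCH_OPEN_CREAT) != 0;
    bool excl = (flags & SKETCH_OPEN_EXCL) != 0;

    if (archiveExists) {
        /* create + exclusive means the archive *must* be created */
        return (creat && excl) ? OPEN_FAIL : OPEN_EXISTING;
    }
    /* archive is missing: only the create flag allows creation */
    return creat ? OPEN_CREATE_NEW : OPEN_FAIL;
}
```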
NufxLib can tell the difference between a BXY file (NuFX in a Binary II wrapper) and a BNY file with several entries whose first entry happens to be a NuFX archive. Access to BNY files that happen to have a ShrinkIt archive in them isn't supported.
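Returning to the temp file name rule given above (a name ending in six 'X's is treated as a mktemp-style pattern), an application that wants to know in advance how its temp pathname will be interpreted could apply a check like the following. The helper name is hypothetical, and this is only a guess at the test; the library's own check may differ in detail.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if "name" ends in six 'X' characters. */
bool is_mktemp_pattern(const char* name)
{
    static const char kSuffix[] = "XXXXXX";
    size_t suffixLen = sizeof(kSuffix) - 1;
    size_t len = strlen(name);

    if (len < suffixLen)
        return false;
    /* compare only the last six characters of the name */
    return strcmp(name + len - suffixLen, kSuffix) == 0;
}
```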
Commits all pending write operations.
"pStatusFlags" gets a bit vector of flags regarding the status of the archive. If a non-kNuErrNone result is returned, "pStatusFlags" may contain one or more of the following:
Some of the above are mutually exclusive, e.g. only one of kNuFlushSucceeded, kNuFlushAborted, and kNuFlushCorrupted will be set.
Any records without threads -- either created that way or having had all threads deleted -- will be removed. Newly-created records without filename threads will have one added. (Existing records without filenames are frowned upon but left alone.)
Normally, the archive is reconstructed in the temp file, and the temp file is renamed over the original archive after all of the operations have completed successfully. As a performance optimization, if kNuValueModifyOrig is "true", NuFlush will try to modify the archive in place. This is only possible if the changes made to the archive consist entirely of additions of new files, updates to pre-sized threads, and/or setting record attributes. If other changes have been made, the update will be done through the temp file.
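The in-place rule above amounts to a subset test on the kinds of pending changes. A minimal sketch follows, assuming the pending changes are tracked as a bit mask; the CHANGE_ constants and the function name are invented for illustration and do not reflect how NufxLib tracks changes internally.

```c
#include <stdbool.h>

/* Hypothetical change categories, one bit each. */
enum {
    CHANGE_ADD_RECORD      = 1 << 0,  /* new files added */
    CHANGE_UPDATE_PRESIZED = 1 << 1,  /* pre-sized thread updated */
    CHANGE_SET_ATTR        = 1 << 2,  /* record attributes changed */
    CHANGE_DELETE          = 1 << 3,  /* record or thread deleted */
    CHANGE_OTHER           = 1 << 4   /* anything else */
};

/* True if a flush could update the original archive in place. */
bool can_modify_in_place(bool modifyOrig, unsigned pendingChanges)
{
    const unsigned kInPlaceOk =
        CHANGE_ADD_RECORD | CHANGE_UPDATE_PRESIZED | CHANGE_SET_ATTR;

    if (!modifyOrig)
        return false;   /* kNuValueModifyOrig is not set */
    /* every pending change must belong to the allowed set */
    return (pendingChanges & ~kInPlaceOk) == 0;
}
```

Any single disallowed change, such as a deletion, sends the whole flush through the temp file.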
If an operation fails during the flush, all changes will be aborted. If something fails in a way that can't be recovered from, such as failing to rename the temp file after a successful flush or failing partway through an update to the original archive, the archive may be switched to read-only mode to prevent future operations from compounding the problem.
Add a new record with no threads. The index of the created record is returned in "pRecordIdx". This always creates a "version 3" record, and expects that the filename will be stored in a thread.
"pFileDetails" is a pointer to a NuFileDetails structure. This contains most of the interesting fields in a record, such as access flags, dates, file types, and the filename. The "threadID" field is ignored for this call.
"pRecordIdx" may be NULL. However, the only way to add threads to the record is with NuAddThread, which requires the record index as a parameter, so you almost certainly want to get this value.
If no filename thread is added, the NuFlush call will use the "storageName" field from the "pFileDetails" parameter to create a filename thread for it.
If no threads are added at all, the NuFlush call will throw the record away.
The "pFileDetails->storageName" may not start with the filename separator character, e.g. "/tmp/foo" is illegal but "tmp/foo" is okay.
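An application can apply the same rule before handing a name to the library. The check below is a sketch with a hypothetical function name; it tests only the leading-separator restriction stated above and nothing else about the name.

```c
#include <stdbool.h>

/* Hypothetical pre-check: a storage name must not begin with the
 * filename separator character ("fssep"). */
bool storage_name_ok(const char* storageName, char fssep)
{
    return storageName[0] != fssep;
}
```

With '/' as the separator, "/tmp/foo" is rejected and "tmp/foo" is accepted.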
If a disk image thread is added to the record, and the "storageType" and "extraType" values set by "pFileDetails" aren't compatible, the entries will be replaced with values appropriate for the thread. For records with non-disk data-class threads, the storageType will be adjusted when necessary.
Depending on the values of kNuValueAllowDuplicates and kNuValueHandleExisting, this may replace an existing record in the archive. See Replacing Existing Records and Files for details.
Add a new thread to a record. You may add threads to an existing record or a newly created one. Some combinations of threads are not allowed, and will cause an error to be returned. (See the NuFX Addendum for details.)
"recordIdx" is the index of the record being added to.
"threadID" is the class and kind of the thread being added. This defines how the data is labeled in the archive, and whether the contents of "pDataSource" are to be regarded as pre-sized or not.
"pDataSource" is where the data comes from. If the source is uncompressed, the thread will be compressed with the compression value currently defined by kNuValueDataCompression. (You can set the value independently for each call to NuAddThread.) Only data-class threads will be compressed. If you're adding a pre-sized thread, such as a comment or filename, set the "otherLen" field in the data source.
"pThreadIdx" gets the thread index of the newly created thread. This parameter may be set to NULL.
Threads will be arranged in an appropriate order that may not be the same as the order in which NuAddThread was called.
If "threadID" indicates the thread is a disk image, then the uncompressed length must either be a multiple of 512 bytes, or must be equal to (recExtraType * recStorageType) in the record header.
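The disk image length rule above is easy to check before calling the library. The following is a minimal sketch with a hypothetical function name; the parameter widths are assumptions chosen for the example, not taken from NufxLib.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pre-check for a disk image thread's length. */
bool disk_image_len_ok(uint32_t uncompLen, uint32_t recExtraType,
    uint16_t recStorageType)
{
    /* a whole number of 512-byte blocks is always acceptable */
    if (uncompLen % 512 == 0)
        return true;
    /* otherwise the length must equal the product of the two record
     * header fields; widen to 64 bits so the product cannot wrap */
    return (uint64_t)recExtraType * recStorageType == uncompLen;
}
```

For example, a 143360-byte image (280 blocks of 512 bytes) passes the first test regardless of the record header values.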
On successful completion, the library takes ownership of "pDataSource". The structure will be freed after a NuFlush call completes successfully or all changes are aborted. Until NuFlush or NuAbort completes, it is vital that you don't free the underlying resource. That is, don't close the FILE*, delete the file, or free the buffer that the data source references. If you don't want to keep track of the resources used by FP and Buffer sources, you can specify "fcloseFunc" or "freeFunc" functions to have them released automatically. See the explanation of NuDataSource for details.
Add a file to the archive. This is a combination of NuAddRecord and NuAddThread, but goes a little beyond that. If you add a file whose pFileDetails->threadID indicates a data fork, and another file whose pFileDetails->threadID indicates a resource fork, and both files have the same pFileDetails->storageName, then the two files will be combined into a single record.
"pathnameUNI" is how to open the file. It does not have any bearing on the filename stored in the archive. Because all write operations are deferred, NufxLib will not open or even test the existence of the file before NuFlush is called.
"pFileDetails" describes the file types, dates, and access flags associated with the file, as well as the filename that will be stored in the archive ("storageName"). If two forks are placed in the same record, whichever was added first will determine the record's characteristics.
"fromRsrcFork" should be set if NufxLib should get the data out of the "pathnameUNI" file's resource fork. If the underlying filesystem doesn't support resource forks, then the argument has no effect. It does not have any impact on whether the data is stored as a data fork thread or resource fork thread -- that is decided by the "threadID" field of "pFileDetails".
"pRecordIdx" gets the record index of the new (or existing) record. This argument may be NULL.
The "pFileDetails->storageName" may not start with the filename separator character, e.g. "/tmp/foo" is illegal but "tmp/foo" is okay.
If "pFileDetails->threadID" indicates the thread is a disk image, then the uncompressed length must either be a multiple of 512 bytes, or must be equal to recExtraType * recStorageType.
On systems with forked files, such as GS/OS and Mac OS, it will be necessary to call NuAddFile twice on forked files. The call will automatically join forks with identical names.
Depending on the values of kNuValueAllowDuplicates and kNuValueHandleExisting, this may replace an existing record in the archive. See Replacing Existing Records and Files for details.
Adding a directory will not cause NufxLib to recursively descend through the directory hierarchy. That's the application's job. Requests to add directories are currently ignored. [A future release may add a "create directory" control thread, so we can store empty directories.]
Rename an existing record. Pass in the index of the record to update, the new name, and the filename separator character. Setting the name to an empty string is not permitted.
This call will do one of three things to the archive. If a filename thread is present in the record, and it has enough room to hold the new filename, then the existing thread will be updated. If a filename thread is present, but doesn't have enough space to hold the new name, then the existing thread will be deleted and a new filename thread will be added. Finally, if no filename thread is present, a new one will be added, and the filename in the record header (if one was set) will be dropped.
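The three outcomes described above can be laid out as a short decision function. This is an illustrative sketch only: the enum, the function, and its parameters are hypothetical, and it assumes that a new name exactly as long as the thread's capacity counts as having "enough room".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three rename outcomes. */
enum rename_action {
    RENAME_UPDATE_IN_PLACE,   /* existing filename thread has room */
    RENAME_REPLACE_THREAD,    /* delete old thread, add a new one */
    RENAME_ADD_THREAD         /* no thread; add one, drop header name */
};

enum rename_action plan_rename(bool hasFilenameThread,
    size_t threadCapacity, size_t newNameLen)
{
    if (!hasFilenameThread)
        return RENAME_ADD_THREAD;
    /* assumption: an exact fit is treated as enough room */
    if (newNameLen <= threadCapacity)
        return RENAME_UPDATE_IN_PLACE;
    return RENAME_REPLACE_THREAD;
}
```

Because filename threads are usually pre-sized with at least 32 bytes, short ProDOS-style names will generally take the first path.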
NufxLib does not currently test for the existence of records with an identical name. This is probably a bug (ought to obey the kNuValueAllowDuplicates setting).
Set a record's attributes. The fields in the NuRecordAttr struct replace the fields in the record. This can be used to change filetypes, modification dates, access flags, and the file_sys_id field.
The changes become visible to NuContents calls only after NuFlush is called.
You can fill in values in the NuRecordAttr from a NuRecord struct with the NuRecordCopyAttr call.
Update the contents of a pre-sized thread. This can only be used on filename and comment threads. Attempting to use it on other threads results in a kNuErrNotPreSized return value.
"threadIdx" is the index of the thread to update, and "pDataSource" is where the data comes from. The "otherLen" field in "pDataSource" is ignored, because this call cannot be used to resize an existing thread. (The only way to do that is to delete the thread and then create a new one.)
"pMaxLen" will hold the maximum size of the thread if the call succeeds. If the call fails because the existing thread is too small, kNuErrPreSizeOverflow is returned and "pMaxLen" will be valid. (You can also get the size by examining the thread's thCompThreadEOF field.)
This cannot be used on newly-added, deleted, or updated threads.
On successful completion, the library takes ownership of "pDataSource". The structure will be freed after a NuFlush call completes successfully or all changes are aborted. Until NuFlush or NuAbort completes, it is vital that you don't free the underlying resource. That is, don't close the FILE*, delete the file, or free the buffer that the data source references. If you don't want to keep track of the resources used by FP and Buffer sources, you can specify "fcloseFunc" or "freeFunc" functions to have them closed automatically. See the explanation of NuDataSource for details.
Bulk delete. This tries to delete every record in the archive, invoking the SelectionFilter callback if one has been specified.
You cannot delete a record that is newly-added, has been modified, has already been deleted, or has had threads added, deleted, or updated. Such records will be skipped over, so your selection filter simply won't see them.
Because deletion is a deferred write operation, none of the records will actually be deleted until NuFlush is called. If NuDelete was successful in its attempt to delete every record, and no new records were added, the NuFlush call will mark the archive as being brand new (this differs from v1.0, which failed with kNuErrAllDeleted). As a result, if you close the empty archive without adding anything to it, the archive file will be removed.
Delete a single record, specified by record index.
You cannot delete a record that is newly-added, has been modified, has already been deleted, or has had threads added, deleted, or updated.
The record will be removed when NuFlush is called.
Delete a single thread, specified by thread index. If you delete all of the threads in a record, and don't add any new ones, the record will be removed.
You cannot delete a thread that is newly-added, deleted, or has been updated.
The thread will not be removed until NuFlush iscalled.
Closes thearchive. If the archive was opened read-write, any pending changes will beflushed first. If the flush attempt fails, NuClose will leave the archive open and returnwith an error.
When the archive is closed, the temp file associated with a read/writearchive will be closed and removed.
All data structures associated with the archive are freed. Attemptingto use "pArchive" further results in an error (or worse).
Abort all pending changes. NufxLib will throw out every pendingmodification request, returning to the state it was in following the most recentOpen or Flush.
This does not close or manipulate any files, except for those pointed to bydata sources with "fcloseFunc" set. For the most part it simply updates internaldata structures.
It's perfectly safe to call this if there are no pending changes. Thecall just returns without doing anything.
Get a pointer to the NuFX MasterHeader block. One useful item here isthe number of records in the archive.
IMPORTANT: do not retain the pointer after calling NuFlush or NuAbort.
Store an arbitrary void* pointer in the NuArchive structure. This can be useful for accessing application data within a callback without resorting to global variables.
Manipulate one of NufxLib's configurable values. See the tables for details.
Get an archive attribute, such as whether it's wrapped in a Binary II header. See the tables for details.
Print debugging information to stdout. The output contains a rather verbose description of the archive. This call is only functional if the library was built with debugging enabled. If the library was built without assertions or debug messages, this call returns an error.
Sources and sinks provide a way for the application to add from and extract to something other than a named file on disk. There are three kinds of sources and sinks:
NuDataSource objects are used in conjunction with deferred write calls. They specify a location from which data is read. All DataSource creation calls take the following arguments:
The remaining arguments are detailed next.
Create a data source from a file on disk. Because all write operations are deferred, the file will not actually be opened until NuFlush is called. This means that if the file is unreadable or doesn't exist, the data source create call will succeed, but the eventual NuFlush call will fail.
The entire contents of the file will be used. The file is opened when needed and closed when processing completes.
"pathnameUNI" is the name of the file to open. If you use the same pathname with more than one data source, each data source will open and close the file.
"isFromRsrcFork" determines whether the data fork or resource fork should be opened. This only has meaning on systems like Mac OS and GS/OS, where the "open" call determines which fork is opened. For other systems, always set it to "false".
Create a data source from a FILE*. The FILE* must be seekable, i.e. you can't use a stream like stdin. Because all write operations are deferred, any problems with the stream, such as an early EOF, will not be detected until the NuFlush call is made.
"fp" is the stream to use. It will be seeked immediately before use, so it is permissible to use the same fp in more than one data source. If you are developing for a system that differentiates between fopen(filename, "r") and fopen(filename, "rb"), use the latter or you may get "unexpected EOF" failures.
"offset" is the starting offset in the file. The file will be seeked to this point right before it is used.
"length" is the number of bytes to use.
The "fcloseFunc" parameter points to a function that calls fclose() on its argument. It's bad practice (especially in the Win32 DLL world) to allocate in the app and free in the library, so this provides a way to let the library choose when to close the file, but let the application manage its own heap. If this argument is nil, the FILE* will not be closed when processing on this data source completes.
IMPORTANT: if you use the same FILE* in more than one data source, do not provide an fcloseFunc for any of them. Deferred write operations are not guaranteed to happen in any particular order, so if you set fcloseFunc the library may close the file when it is still needed.
Create a data source from a memory buffer. Invalid memory references will not be detected until NuFlush is called.
"buffer" is a pointer to the memory you want to use. It is okay for "buffer" to be nil so long as "offset" and "length" are zero. This may be useful when creating an empty comment thread.
"offset" is the offset from "buffer" at which the data starts.
"length" is the number of bytes to use.
The "freeFunc" parameter points to a function that calls "free", "delete", or "delete[]" on its argument. There's no way for NufxLib to know exactly how the memory was allocated (malloc/new/new[]/custom), so the application needs to supply a function to clean it up. If this argument is nil, the buffer will not be freed when processing on this data source completes. (Side note: the "offset" parameter exists so that you can use part of a buffer and then let the library free the whole thing afterward.)
IMPORTANT: if you use the same memory buffer in more than one data source, do not provide a freeFunc for any of them. Deferred write operations are not guaranteed to happen in any particular order.
Free a data source. You should only do this if the data source was not used in a successful deferred write call.
If "fcloseFunc" or "freeFunc" is set in the data source, the appropriate action will be taken. (NufxLib may actually make copies of DataSource objects with ref-counting, so freeing your object may not cause an immediate fclose or free.)
When the data source contains already-compressed data, there's no way for NufxLib to compute the CRC of the uncompressed data without expanding it. Version 3 records require a data CRC in the thread header. This provides a way for the application to specify what value should be in the "thThreadCrc" field.
NuDataSink calls are used with the thread extraction function. They allow the application to specify where data is to be written to. All DataSink creation calls take the following arguments:
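The ref-counting remark above can be pictured with a small sketch. Everything here (the "SketchSource" type and its functions) is a hypothetical stand-in written for illustration, not NufxLib's actual data structures; it only shows why freeing one copy of a shared source need not release the underlying resource right away.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a ref-counted data source (not NufxLib's layout). */
typedef struct SketchSource {
    int refCount;                   /* number of live references */
    void* resource;                 /* e.g. a FILE* or a heap buffer */
    void (*releaseFunc)(void*);     /* plays the role of fcloseFunc/freeFunc */
} SketchSource;

/* Create a source holding one reference. Returns NULL if allocation fails. */
SketchSource* SketchSource_New(void* resource, void (*releaseFunc)(void*))
{
    SketchSource* src = (SketchSource*) malloc(sizeof(*src));
    if (src == NULL)
        return NULL;
    src->refCount = 1;
    src->resource = resource;
    src->releaseFunc = releaseFunc;
    return src;
}

/* "Copy" the source by taking another reference to the same object. */
SketchSource* SketchSource_Copy(SketchSource* src)
{
    src->refCount++;
    return src;
}

/* Drop one reference; the resource is released only with the last one. */
void SketchSource_Free(SketchSource* src)
{
    if (--src->refCount > 0)
        return;
    if (src->releaseFunc != NULL)
        src->releaseFunc(src->resource);
    free(src);
}

/* Example release function: counts calls instead of closing anything. */
void CountingRelease(void* resource)
{
    (*(int*) resource)++;
}
```

In this sketch, an application that frees its own reference while another copy is still held sees no release call until the second reference is dropped.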
The remaining arguments are detailed next.
Create a data sink for a named file on disk. The file will be opened, written to, and then closed.
Because of a peculiarity in NufxLib design, the OutputPathnameFilter callback will be invoked during the extraction if one has been installed. Since your application supplied the filename, it most likely won't want to change it, but this can still be useful in the case where the file exists and needs to be renamed. (This might even be useful, e.g. if your application insists on using the record's filename directly when creating a data sink.)
"pathnameUNI" is the full pathname of the file to write to.
"fssep" is the filesystem separator used in the pathname. This is necessary so NufxLib can build any missing directory components.
Using the same pathname in more than one data sink will likely yield disappointing results, as subsequent extractions will overwrite the earlier ones.
Create a data sink from a FILE*. The stream must be writeable, and must be seeked to the desired offset before the extract call is made.
"fp" is the stream to use.
Using the same FILE* in more than one data sink isn't necessary: you can just re-use the same data sink. The stream is never seeked, so subsequent extractions will append to the earlier ones.
Use a memory buffer as a data sink.
"buffer" is a pointer to the memory buffer.
"bufLen" is the maximum amount of data that the memory buffer can hold.
You can re-use a buffer data sink on multiple extractions. The pointer will be advanced, and bufLen decreased. Exceeding the size of the buffer causes the extraction to fail with a buffer overrun error. (Thus, you can extract more than one thread into the same buffer, but you can't extract one thread into multiple buffers.)
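The "pointer advanced, bufLen decreased" behavior can be sketched in a few lines of C. The "SketchBufferSink" type and its functions are hypothetical names made up for this illustration, and the choice to write nothing at all on overrun is an assumption of the sketch, not a statement about what the library does internally.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a buffer data sink (not NufxLib's real struct). */
typedef struct SketchBufferSink {
    unsigned char* cur;     /* next write position; advances on each write */
    size_t remaining;       /* space left; decreases on each write */
    size_t written;         /* running total, like the sink's output count */
} SketchBufferSink;

void SketchBufferSink_Init(SketchBufferSink* sink, void* buffer, size_t bufLen)
{
    sink->cur = (unsigned char*) buffer;
    sink->remaining = bufLen;
    sink->written = 0;
}

/* Returns 0 on success, -1 on overrun (assumption: nothing is written then). */
int SketchBufferSink_Write(SketchBufferSink* sink, const void* data, size_t len)
{
    if (len > sink->remaining)
        return -1;
    if (len > 0)
        memcpy(sink->cur, data, len);
    sink->cur += len;
    sink->remaining -= len;
    sink->written += len;
    return 0;
}
```

Two successive writes land back to back in the same buffer, which mirrors extracting several threads into one sink; a write that does not fit fails rather than spilling into a second buffer.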
Free a NuDataSink.
Get the number of bytes that have been written to a data sink. The result will be placed into "pOutCount". This can come in handy if you've extracted a number of things into a memory buffer and aren't sure exactly how much is in there (perhaps because of EOL conversions).
These functions allow you to set callbacks on a per-archive basis.
Most NufxLib calls are illegal in a callback function (NufxLib is not reentrant for a single NuArchive). The only calls you are allowed to make are NuGetExtraData, NuSetExtraData, NuGetValue, NuSetValue, and NuGetAttr.
The application must not keep copies of pointers passed to a callback. If you want to keep the information from (say) a NuRecord*, you will need to copy the contents of the struct to local storage.
If something has a "const" pointer, don't write to it. The results of doing so are unpredictable (but most likely bad).
All callbacks are of type NuCallback, which is defined as:
NuResult (*NuCallback)(NuArchive* pArchive, void* args);
The "set" functions return the previous callback, all of which default to NULL. If the "pArchive" argument is invalid, the calls will fail and return kNuInvalidCallback.
The selection filter callback is used to select records and threads during bulk operations. The argument passed into the callback is a "const NuSelectionProposal*":
typedef struct NuSelectionProposal {
    const NuRecord* pRecord;
    const NuThread* pThread;
} NuSelectionProposal;
These are pointers to the NuFX record and thread that we are about to act upon. During an extract operation, "pThread" will point at the thread we are about to extract. During a delete operation, "pThread" will point at the first thread in the record we are about to delete.
Valid return values from a selection filter:
If no selection filter is specified, then all records will be selected.
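As an illustration of the callback shape, here is a sketch of a selection filter that accepts only data-fork threads and counts its invocations through the archive's "extra data" pointer. All types and names prefixed "Sketch" or "kSketch" are hypothetical stand-ins so the example compiles on its own; the result codes and the thread ID value 0x00020000 (thread class 2, kind 0) are assumptions made for the sketch rather than definitions taken from NufxLib.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the library's types. */
typedef enum { kSketchOK, kSketchSkip, kSketchAbort } SketchResult;
typedef struct SketchThread { uint32_t threadID; } SketchThread;
typedef struct SketchRecord { const char* filename; } SketchRecord;
typedef struct SketchArchive { void* extraData; } SketchArchive;

typedef struct SketchSelectionProposal {
    const SketchRecord* pRecord;
    const SketchThread* pThread;
} SketchSelectionProposal;

/* Same shape as NuCallback: an archive pointer plus an untyped argument. */
SketchResult SelectDataForksOnly(SketchArchive* pArchive, void* args)
{
    const SketchSelectionProposal* proposal =
        (const SketchSelectionProposal*) args;
    int* pCallCount = (int*) pArchive->extraData;

    /* Application state reaches the callback without global variables. */
    if (pCallCount != NULL)
        (*pCallCount)++;

    /* Assumed data-fork ID: class 0x0002 in the high word, kind 0x0000. */
    if (proposal->pThread->threadID == 0x00020000u)
        return kSketchOK;
    return kSketchSkip;
}
```

The filter only reads through the const pointers it is handed and keeps nothing after returning, in line with the rules for callbacks given earlier.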
When extracting files, this callback allows you to change the name of the file that will be opened on disk. It will be called once for every thread we extract. The argument to the callback is a "NuPathnameProposal*":
typedef struct NuPathnameProposal {
    const UNICHAR* pathnameUNI;
    UNICHAR filenameSeparator;
    const NuRecord* pRecord;
    const NuThread* pThread;
    const UNICHAR* newPathnameUNI;
    UNICHAR newFilenameSeparator;
    NuDataSink* newDataSink;
} NuPathnameProposal;
The fields are:
If a record contains a data fork and a resource fork, your filter will be called twice with the same pathname. (You can examine pThread to see what kind of fork is being extracted.) If the OS requires that extended files be initially created as such, then the file will always be created as "extended" if the record indicates that a resource fork is present.
This mechanism can be used to implement a "rename file being extracted" feature. If an error handler is defined, and it returns kNuRename when NufxLib tries to overwrite an existing file, then the pathname filter will be invoked again.
Valid return values from the output pathname filter:
If no OutputPathnameFilter is set, the files will be opened with the names that appear in the archive.
During add, extract, and test operations, NufxLib will send progress update messages via the ProgressUpdater callback. The argument to the callback is a "const NuProgressData*":
typedef struct NuProgressData {
    NuOperation operation;
    NuProgressState state;
    short percentComplete;
    const UNICHAR* origPathnameUNI;
    const UNICHAR* pathnameUNI;
    const UNICHAR* filenameUNI;
    const NuRecord* pRecord;
    uint32_t uncompressedLength;
    uint32_t uncompressedProgress;
    struct {
        NuThreadFormat threadFormat;
    } compress;
    struct {
        uint32_t totalCompressedLength;
        uint32_t totalUncompressedLength;
        const NuThread* pThread;
        NuValue convertEOL;
    } expand;
} NuProgressData;
The possible values for a NuOperation value are:
Deleting files and listing contents don't cause the progress update callback to be called, so you'll never see "kNuOpDelete" or "kNuOpContents" in a progress handler.
The possible values for a NuProgressState value are:
Some values (say, kNuProgressCompressing) are only appropriate for certain operations (kNuOpAdd).
Valid return values from a progress updater are:
If no ProgressUpdater is defined, no progress update information will be sent.
The ErrorHandler callback deals with all exceptional conditions that arise. The callback may define hard-coded policy or query the user for directions. The argument to the callback is a "const NuErrorStatus*":
typedef struct NuErrorStatus {
    NuOperation operation;
    NuError err;
    int sysErr;
    const UNICHAR* message;
    const NuRecord* pRecord;
    const UNICHAR* pathnameUNI;
    const void* origPathname;
    UNICHAR filenameSeparator;
    char canAbort;
    char canRetry;
    char canIgnore;
    char canSkip;
    char canRename;
    char canOverwrite;
} NuErrorStatus;
Some situations that may arise:
The valid return values are defined by the NuErrorStatus structure.
If no ErrorHandler is defined, an appropriate default action (usually kNuAbort) is taken.
Specify a callback to receive text error messages. These are typically an error message followed by an explanation of the error code that the library is about to return. The callback takes an argument of type "const NuErrorMessage*", which is defined as:
typedef struct NuErrorMessage {
    const char* message;
    NuError err;
    short isDebug;
    const char* file;
    int line;
    const char* function;
} NuErrorMessage;
The return value is ignored.
Some error messages aren't associated with an archive, generally because they occur when an archive is being opened. Since there's no way to associate them with a single archive, the handler must be global to the entire library. The second form of this call allows you to specify where global error messages should be sent. The arguments to the callback are identical, but "pArchive" will be nil.
If no callback is specified, the messages are sent to stderr. If your application doesn't have a stderr (perhaps it's a GUI application), be sure to set both the ErrorMessage and GlobalErrorMessage handlers.
Some of these are macros, some are functions. None require that an archive be open.
Get some information about NufxLib's version. This sets the major and minor version numbers, as well as setting strings with the build date and some build flags.
Any or all of the arguments may be NULL, for values you aren't interested in.
The format of "ppBuildDate" is not defined [though it probably should be].
The format of "ppBuildFlags" is a string of compiler flags separated by white space (spaces or tabs). It is expected to represent an "interesting subset" of the flags sent to the compiler, such as the level of optimization used.
Return a pointer to a string describing a NufxLib error. NufxLib errors are "err" values less than zero. "err" values greater than zero are system errors that can be processed with strerror() or perror(), and an "err" value of zero indicates success.
Test for support of an optional feature. See the tables for a list. Returns kNuErrNone on success, kNuErrUnsupFeature if the feature is known but not supported, or kNuErrUnknownFeature if the feature is not recognized at all (probably because the version of NufxLib you're linked with is older than what you compiled against).
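The sign convention for "err" values lends itself to a small dispatch helper. In the sketch below, "DescribeError" and "SketchLibErrorStr" are hypothetical names; the library's own string lookup is represented by a caller-supplied function pointer so that the example stands alone, and the message text is invented.

```c
#include <string.h>

/*
 * Describe "err" using the convention above: zero is success, positive
 * values are system errors, negative values belong to the library.
 * "libErrorStr" is a caller-supplied lookup for the negative range.
 */
const char* DescribeError(int err, const char* (*libErrorStr)(int))
{
    if (err == 0)
        return "success";
    if (err > 0)
        return strerror(err);       /* system error, e.g. an errno value */
    if (libErrorStr != NULL)
        return libErrorStr(err);    /* library-specific error */
    return "unknown library error";
}

/* Stand-in lookup with made-up text, used only to exercise the helper. */
const char* SketchLibErrorStr(int err)
{
    return (err == -1) ? "generic library error (sketch)"
                       : "unrecognized library error (sketch)";
}
```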
Construct a NuThreadID, given a thread class and thread kind.
Construct a NuThreadID, using the thread class and thread kind defined in a NuThread.
Pull the thread class out of a NuThreadID.
Pull the thread kind out of a NuThreadID.
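A thread ID of this kind can be modeled as two 16-bit fields packed into one 32-bit value. The sketch below assumes the class occupies the upper 16 bits and the kind the lower 16; that layout and the "Sketch" function names are assumptions made for illustration, so check NufxLib.h for the real macros before relying on it.

```c
#include <stdint.h>

/* Pack a thread class and kind into one ID (assumed layout: class high). */
uint32_t SketchMakeThreadID(uint16_t threadClass, uint16_t threadKind)
{
    return ((uint32_t) threadClass << 16) | (uint32_t) threadKind;
}

/* Recover the class from the upper 16 bits. */
uint16_t SketchThreadIDGetClass(uint32_t threadID)
{
    return (uint16_t) (threadID >> 16);
}

/* Recover the kind from the lower 16 bits. */
uint16_t SketchThreadIDGetKind(uint32_t threadID)
{
    return (uint16_t) (threadID & 0xffff);
}
```

With this layout, a data-class thread (class 0x0002) of kind 0x0002, which the NuFX specification uses for a resource fork, packs to 0x00020002.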
Pull the filename separator character out of the file_sys_info word.
Put the filename separator character into a file_sys_info word. Returns the new value.
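The two separator helpers amount to reading and replacing one byte of the 16-bit file_sys_info word. This sketch assumes the separator lives in the low byte and that the high byte should be preserved when the separator changes; the function names are hypothetical.

```c
#include <stdint.h>

/* Extract the separator, assumed to be the low byte of file_sys_info. */
char SketchGetSepFromSysInfo(uint16_t fileSysInfo)
{
    return (char) (fileSysInfo & 0xff);
}

/* Return a copy of file_sys_info with the low byte replaced by newSep. */
uint16_t SketchSetSepInSysInfo(uint16_t fileSysInfo, char newSep)
{
    return (uint16_t) ((fileSysInfo & 0xff00) | (unsigned char) newSep);
}
```

Because the setter returns the new word instead of modifying its argument, the caller has to store the result.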
Return the number of threads in a record.
Get the idx-th thread from pRecord. If idx is less than zero or past the end of the thread array, nil is returned.
Copy data from "pRecord" into "pRecordAttr". Only the fields that exist in a NuRecordAttr are copied. This can be useful in conjunction with the SetRecordAttr call.
Copy the thread array out of a record. This is useful if you want to keep your own copy of a thread array.
Returns "true" if the threadID is considered pre-sized by NufxLib. Right now, only filenames and comments are given this treatment.
Convert Mac OS Roman to Unicode (UTF-8 or UTF-16). Returns the number of bytes required to hold the converted string. "bufUNI" may be NULL. [Not implemented for Win32.]
Convert Unicode to Mac OS Roman. Returns the number of bytes required to hold the converted string. "bufMOR" may be NULL. [Not implemented for Win32.]
kNuValueIgnoreCRC | Boolean (false). Don't verify header or data CRCs. This can provide a minor speed improvement, but allows certain kinds of errors to go undetected. |
kNuValueDataCompression | Enum (kNuCompressLZW2). Threads that can be compressed (i.e. data-class threads) will be compressed with the specified compression. Possible values are:
|
kNuValueDiscardWrapper | Boolean (false). If changes are made to the archive that cause a new copy to be reconstructed in the temp file, then when this is set to "true" any BXY, BSE, or SEA wrapper will be stripped off. This also causes any "junk" at the start of the file to be removed. |
kNuValueEOL | Enum (system-dependent). End-of-line marker appropriate for the current system. If EOL conversion is enabled, extracted files will be converted to this EOL value. Valid values are:
|
kNuValueConvertExtractedEOL | Enum (kNuConvertOff). This determines whether "bulk" extractions do EOL conversions. Possible values:
|
kNuValueOnlyUpdateOlder | Boolean (false). If set, only overwrite existing records and files if the item being added or extracted is newer than the one being replaced. Useful for an "update" or "freshen" option. The date used for comparison is the modification date. |
kNuValueAllowDuplicates | Boolean (false). If set to "true", duplicate records are allowed in the archive. If "false", the collision will be handled according to the kNuValueHandleExisting setting. Filename comparisons are case-insensitive. |
kNuValueHandleExisting | Enum (kNuMaybeOverwrite). This determines how duplicate filename collisions are handled. Valid values:
The case sensitivity when extracting is determined by the underlying filesystem. |
kNuValueModifyOrig | Boolean (false, unless the archive was just created by NufxLib). If this is "true", then an effort will be made to handle all updates in the original archive, rather than reconstructing the entire archive in a temp file. Updates to pre-sized threads, changes to record attributes, and additions of new files can all be made to the original archive. There is some risk of corruption if the flush fails, so use this with caution. |
kNuValueMimicSHK | Boolean (false). If set, attempt to mimic the behavior of ShrinkIt as closely as possible. See the ShrinkIt Compatibility Mode section. |
kNuValueMaskDataless | Boolean (false). If set to "true", records without data threads have "fake" threads created for them, so that they appear as they would had they been created correctly. |
kNuValueStripHighASCII | Boolean (false). If set to "true", files filled with high-ASCII characters will be stripped if and only if an EOL conversion is performed. |
kNuValueJunkSkipMax | Integer (1024). If the archive file doesn't start with a recognized sequence, NufxLib will assume that some junk has been added to the start of the file and will scan forward at most this many bytes in an attempt to locate the real archive start. |
kNuValueIgnoreLZW2Len | Boolean (false). If set to "true", the length value embedded in LZW2 compressed chunks is ignored. This is useful for archives created with a specific broken application. (This is deprecated -- use HandleBadMac instead.) |
kNuValueHandleBadMac | Boolean (false). Recognize and handle "bad Mac" archives, which have a bad value ('?') for the filename separator character, and write an LZW/2 length value in big-endian order. |
kNuAttrArchiveType | Returns one of the following:
|
kNuAttrNumRecords | Returns the number of records in the archive. This value does not reflect unflushed changes. |
kNuAttrHeaderOffset | Returns the offset of the NuFX header from the start of the file. This will be nonzero for archives with a Binary II or self-extracting wrapper. |
kNuAttrJunkOffset | Returns the amount of junk found at the start of the file. A nonzero value here indicates that junk was found. |
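The junk-skipping behavior behind kNuValueJunkSkipMax and kNuAttrJunkOffset can be sketched as a bounded forward scan. This simplified version looks only for the six-byte NuFX master header ID ("NuFile" in alternating low/high ASCII) in a memory buffer; recognition of Binary II or self-extracting wrappers is left out, and the function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* NuFX master header ID: "NuFile" with alternating high bits. */
static const unsigned char kSketchNuFileID[6] = {
    0x4E, 0xF5, 0x46, 0xE9, 0x6C, 0xE5
};

/*
 * Scan forward from the start of "data" for the header ID, skipping at
 * most "maxSkip" bytes of junk.  Returns the offset of the ID (the
 * amount of junk found), or -1 if it isn't found within the limit.
 */
long SketchFindArchiveStart(const unsigned char* data, size_t len,
    size_t maxSkip)
{
    size_t off;

    for (off = 0;
         off <= maxSkip && off + sizeof(kSketchNuFileID) <= len;
         off++)
    {
        if (memcmp(data + off, kSketchNuFileID,
                sizeof(kSketchNuFileID)) == 0)
            return (long) off;
    }
    return -1;
}
```

A return value of zero corresponds to an archive with no leading junk, and a positive value is what a junk-offset attribute would report.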
kNuFeatureCompressSQ | Test for support of SQueeze compression |
kNuFeatureCompressLZW | Test for support of ShrinkIt LZW/1 and LZW/2 compression |
kNuFeatureCompressLZC | Test for support of 12- and 16-bit LZC |
kNuFeatureCompressDeflate | Test for support of zlib "deflate" |
kNuFeatureCompressBzip2 | Test for support of libbz2 "bzip2" |
When using NuAddFile or NuAddRecord, there are three flags that affect what happens when an existing record has the same name:
The AllowDuplicates flag determines whether or not we think duplicate records are at all interesting. If an application sets it to true, the floodgates are opened, and the two other flags are ignored.
The OnlyUpdateOlder flag is considered next. If it's set to true, and an existing, identically named file in the archive appears to be the same age or newer than the file being added, the record creation attempt fails with an error (kNuErrNotNewer).
The HandleExisting flag comes into play if we get past the first two. If a matching entry is found in the archive, NufxLib either deletes it and allows the add, prompts the user for instructions, or rejects it with an error. NuAddFile and NuAddRecord will return with kNuErrRecordExists if they can't replace an existing record. If "must overwrite" is set, and a matching record does not exist, then kNuErrDuplicateNotFound is returned.
Both AddFile and AddRecord check for duplicates among existing and newly added files. You aren't allowed to delete items that were just added, so the HandleExisting flag is ignored for files you have marked for addition but haven't yet flushed.
AddFile has an additional behavior that takes precedence over all of the flags: it will try to match up the individual forks of a file. If it finds a file in the newly-added list with the same name and a compatible data thread, the new file will be added to the existing record. (A "compatible" data thread is the other half of a forked file, e.g. the application added the data fork, and is now adding the resource fork from the same file.) If the record was found but is not compatible, the AllowDuplicates behavior is used to decide if another "new" record with the same name should be added, or if an error should be returned.
If this sort of treatment is undesirable, i.e. you want a data fork and a resource fork with the same filename to be stored as two separate records, then you should call AddRecord and AddThread. AddFile is meant as a convenience for common operations.
It is possible for NuAddFile and NuAddRecord to partially complete. If a record exists and is deleted, but the call later fails for some other reason, the record will still be deleted.
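The order in which the three flags are consulted can be written down as a small decision function. This is a sketch of the rules as described above, not the library's code: the enum names are hypothetical, the mapping of HandleExisting settings onto "replace", "ask the user", and "reject" is an assumption, and the special cases for newly added records and fork matching are omitted.

```c
/* Hypothetical stand-ins for the HandleExisting settings. */
typedef enum {
    kSketchMaybeOverwrite,      /* ask the application what to do */
    kSketchNeverOverwrite,      /* reject the add */
    kSketchAlwaysOverwrite,     /* replace without asking */
    kSketchMustOverwrite        /* replace; fail if nothing to replace */
} SketchHandleExisting;

/* Possible outcomes of an add request. */
typedef enum {
    kSketchAdd,                     /* add as a new record */
    kSketchReplace,                 /* delete the match, then add */
    kSketchAskUser,                 /* prompt for instructions */
    kSketchErrNotNewer,             /* like kNuErrNotNewer */
    kSketchErrRecordExists,         /* like kNuErrRecordExists */
    kSketchErrDuplicateNotFound     /* like kNuErrDuplicateNotFound */
} SketchAddAction;

SketchAddAction SketchDecideAdd(int allowDuplicates, int onlyUpdateOlder,
    SketchHandleExisting handleExisting, int existingFound,
    int existingSameAgeOrNewer)
{
    /* 1. AllowDuplicates: if true, the other two flags are ignored. */
    if (allowDuplicates)
        return kSketchAdd;

    /* No match: only "must overwrite" objects to a plain add. */
    if (!existingFound) {
        return (handleExisting == kSketchMustOverwrite)
            ? kSketchErrDuplicateNotFound : kSketchAdd;
    }

    /* 2. OnlyUpdateOlder: refuse if the archived copy isn't older. */
    if (onlyUpdateOlder && existingSameAgeOrNewer)
        return kSketchErrNotNewer;

    /* 3. HandleExisting: replace, ask, or reject. */
    switch (handleExisting) {
    case kSketchAlwaysOverwrite:
    case kSketchMustOverwrite:
        return kSketchReplace;
    case kSketchMaybeOverwrite:
        return kSketchAskUser;
    default:
        break;
    }
    return kSketchErrRecordExists;
}
```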
Searching for existing records can take time on a large archive. Enabling AllowDuplicates turns off duplicate checking, which lets NufxLib avoid searching through the lists of records to find matches.
When extracting files from an archive, the "OnlyUpdateOlder" and "HandleExisting" flags are applied to the files on disk. This is done much like the above.
To implement NuLib2's "update" feature, "OnlyUpdateOlder" needs to be set to true. To implement the "freshen" feature, "OnlyUpdateOlder" is set to true and "HandleExisting" is set to "must overwrite".
When extractions are done in bulk, the kNuErrDuplicateNotFound and kNuErrNotNewer errors are passed to the application's error handler function. The error handler is expected to return kNuSkip after perhaps updating the progress status message, but is allowed to abort or require NufxLib to overwrite the file anyway. If no error handler is defined, the file is skipped silently.
One of the goals was to be as compatible with ShrinkIt as possible. ShrinkIt and GS/ShrinkIt occasionally do some strange things, so some of the compatible behaviors are only activated when the "mimic ShrinkIt" flag is set.
These behaviors are:
Some GS/ShrinkIt behaviors are not fully emulated:
Regarding the last item: a quick test with a handful of empty files showed that GS/ShrinkIt v1.1 failed to extract the empty files it had just archived. P8 ShrinkIt v3.4 gets really confused on such archives, and insists that the first entry is a zero-byte disk archive, while the other empty files are actually four bytes long. When asked to extract the files, it does nothing. When adding empty files, P8 ShrinkIt v3.4 does the correct thing, and creates an empty data thread.
The default NufxLib behavior is to work around the bug. When extracting files with a filename but no data or control threads, a zero-byte data file will be created. (In NufxLib v1.0 the default was to ignore such entries unless the "mimic" flag was enabled. This was changed in v1.1 to be enabled at all times. As of v1.2, an empty resource fork is also created if the record's storage type indicates it's an extended file.) If the "MaskDataless" flag is enabled, fake data threads are created, and applications won't even know there's a problem in the archive.
In general, with "mimic ShrinkIt" mode enabled, it should be possible to extract files from a GS/ShrinkIt archive and re-add them to a new archive, with little perceptible difference between the two. Of course, it's up to the application to ensure that all threads (including comments) are retained, file dates aren't altered, and so on. The only situations where NufxLib cannot produce identical results are bugs (e.g. zero-length data files always require more space) and option lists (which NufxLib does not currently support).
The bottom line: it is perfectly normal for NufxLib archives to be a few bytes smaller than GS/ShrinkIt archives, even when "mimic ShrinkIt" mode is enabled. (An example: my 20MB boot partition compressed to about 14MB. With "mimic" mode off, the file was 13K smaller, or about 0.1%.)
Of the various compression formats that NufxLib supports, only LZW/1 and LZW/2 are widely supported. The latest versions of ShrinkIt and II Unshrink try to unpack SQ compression but fail. Archives that use SQ, LZC12, and LZC16 can only be unpacked by GS/ShrinkIt, NuLib, and NufxLib v1.1 and later.
The "deflate" and "bzip2" algorithms are not supported by anything other than NufxLib v1.1+ and CiderPress. They are intended to be used with archives that will never be unpacked on an Apple II. Disk images compressed with these algorithms are especially useful with emulators that use NufxLib.
Some tests with deflate and bzip2 showed that, surprisingly, deflate is nearly always better than bzip2 for Apple II files. This is because deflate appears to do slightly better on machine code and small (< 32K) text files. Since most Apple II files and disk images fall into these categories, there is little advantage to using bzip2. Because deflate uses less memory and is faster, and libbz2 isn't nearly as ubiquitous as libz, I've chosen to disable bzip2 by default.
You can use the NuFeatureTest call to test for the presence of any of the compression algorithms. This makes it possible to build a library or DLL without LZW in it.
NufxLib v1.0 was developed under Solaris 2.5 and Red Hat Linux 6.0, and was ported to Win32 shortly before the alpha release. Porting to other UNIX-like platforms has been straightforward, with most differences contained in the "autoconf" configuration system. For example, the BeOS/PPC port was largely a matter of getting the compiler settings right.
Mac OS and GS/OS have the ability to store file types and resource forks natively. Support for this is not currently part of NufxLib. A data-fork-only port, akin to what is used on UNIX and Win32, should be straightforward though. (In fact, Mac OS X "just worked".)
Once upon a time a GS/OS port was imagined. This never happened, and likely never will.
The decision to pass FILE* structures instead of file descriptors wassomewhat arbitrary. The library uses buffered I/O internally, so it wasconvenient to have them passed in, rather than having an fd passed in and relyon the existence of an fdopen() call. On the other hand, if an applicationis built with a different version of the stdio library (in which the structure ofa FILE* are different), linking with NufxLib might not work. Given that NufxLibis distributed as BSD-licensed source code, I don't see this asbeing a major problem, since you can always rebuild NufxLib with the alteredstdio lib. (This caused me some grief under Windows, because the non-debugmultithreaded DLL version of libc apparently does something wonky with FILE* andfwrite. If the Win32 DLL and Win32 app aren't linked against the same libc,fwrite() will crash. Other versions of libc, e.g. debug multithreaded anddebug single-threaded, interact just fine with each other.) My conclusionafter fighting with Win32 is that it would have been better to pass filedescriptors or a "NuFILE*" with read/write/seek operations that residewholly within the NufxLib library.
The decision to pass data sources and sinks around as structures rather than asfunction pointers was born of a desire to reduce complexity. Setting up a datasource or sink requires making a function call with a lot of arguments, but oncethat's done you can forget all about it -- the code will happily close yourfiles and free your memory when you're done with it. A functionalinterface would require passing in read, write, close, and seek functions, whichgives the application more flexibility but essentially requires the applicationto implement its own version of the data source and sink structures. SinceNufxLib is intended for manipulating archives, not compressing streams of data,the added flexibility did not justify the cost. (I'm becoming less certainof that as time goes by. If I had it all to do again, I probably would usethe functional interface for all file accesses.)
It might have been useful to allow read/write/seek hooks for the archiveitself. The current architecture prevents you from processing an archivethat has been loaded into memory, unless you have memory-based FILE* streams in yourlibc. This became annoying during the development of CiderPress, because Iwanted to handle archives within a wrapper, such as ".shk.gz".
Returning a pointer to an allocated NuArchive structure worked pretty welluntil I wanted to set a parameter that affected the way open works (thejunk-skipping feature). Creating the structure in a separate call beforethe "open" wouldhave been better.
The biggest problem with NufxLib is that it tries to do too much. A muchlarger share of the work should have been done in NuLib2. The easiest way tosee this is to look at a feature like the built-in end-of-line conversion. It'scertainly handy to have, but applications like NuLib2 and CiderPress work withmore than one type of archive, which means we either won't support EOL conversions(e.g. NuLib2 doesn't convert when extracting from BNY) or we have toreimplement it in the app anyway (e.g. CiderPress has a complete copy of the NufxLibimplementation). Nearly everything relating to files should have been done bythe application, with NufxLib using a file descriptor or memory buffer toaccess the archive, and app-supplied read/write functions to handle compresseddata.
There is no real support for GS/OS option lists. The only place you'dever want to add these is on a IIgs, and I find it unlikely that NufxLib willedge out GS/ShrinkIt as the preferred archiver. (Besides, I question theirvalue even on a IIgs.) NufxLib will very carefully preserve them when modifyinga record, but there's no way to add, delete, or modify them directly.
The use of RecordIdx and ThreadIdx, rather than record filename and threadoffset number, was chosen for a number of reasons. The most important wasthat they are unambiguous. Consider that two records may have the samefilename, that one record may have two filenames, and that a record may have nofilename at all, and the need for RecordIdx becomes painfully clear. Icould've avoided ThreadIdx, by using a combination of RecordIdx and threadnumber, but after a few additions and deletions there is a clear advantage tohaving a unique identifier for every thread. Besides, it allowed me todesign calls like NuExtractThread so that they only had to take one identifieras an argument, reducing the amount of stuff an application needs to keep trackof, as well as the amount of error checking that has to be done in the library.
Allowing threads to be copied without expanding and recompressing them is neat, but if I'd known how cluttered the interface would become I probably wouldn't have supported the feature. The NuDataSource calls are confusing enough as it is with the pre-sized thread stuff.
The "bulk" NuAdd interface can be cumbersome. When extracting, you can skip the bulk approach and handle filename conflicts yourself, but when adding you don't have a good alternative if you're adding lots of files. You would have to follow every NuAdd with a NuFlush, which has some performance problems because (unless you configure the safety options off) NuFlush will write all data to the temp file and rotate it.
I chose not to implement EOL conversions when adding files. It's too painful to do this in the library. It would be easier for the application to write an EOL-converted file into a temp file, then use the file add call on that. (The "storage name" is set independently from the source file name, so there's no problem with temp file names showing up in the archive.) One approach that could be used within NufxLib would be to "pre-flight" the file by doing an EOL conversion pass to determine the final file length, then feed that length into the compression functions. The "NuStraw" interface would do the conversion transparently to the compression routines.
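The "pre-flight" idea can be sketched as one routine used twice: once with no output buffer to learn the converted length, and once more to produce the bytes. This is a simplified illustration, not NufxLib code. The function name eol_to_cr is hypothetical, the target convention (a bare CR, as used by Apple II text files) is an assumption for the example, and the whole input is assumed to be in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: converts LF and CRLF line endings to a bare CR.
 * If dst is NULL nothing is written and only the converted length is
 * returned (the "pre-flight" pass). Otherwise dst must have room for
 * the length reported by the pre-flight pass. */
size_t eol_to_cr(const unsigned char *src, size_t srcLen, unsigned char *dst)
{
    size_t out = 0;
    size_t i;

    assert(src != NULL || srcLen == 0);
    for (i = 0; i < srcLen; i++) {
        unsigned char ch = src[i];

        if (ch == '\r' && i + 1 < srcLen && src[i + 1] == '\n') {
            i++;            /* CRLF: keep the CR, skip the LF */
        } else if (ch == '\n') {
            ch = '\r';      /* lone LF becomes CR */
        }
        if (dst != NULL)
            dst[out] = ch;
        out++;
    }
    return out;
}
```

A streaming version would also have to handle a CR at the end of one chunk followed by an LF at the start of the next, which is part of why doing this inside the library is awkward.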
The compression functions might have been better written with a zlib-like API. This would have made it easier to extract the code and use it in other projects. The only disadvantage of doing so is that it adds a little extra buffer copying overhead.
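For readers unfamiliar with the zlib calling convention, the sketch below shows its general shape: the caller owns both buffers, and the codec advances through them one call at a time. The field names follow zlib's z_stream (next_in, avail_in, next_out, avail_out, total_out), but DemoStream and demo_process are hypothetical, and the "codec" here simply copies bytes so the example stays short. It is not an LZW implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stream state, loosely modeled on zlib's z_stream. */
typedef struct {
    const unsigned char *next_in;  /* next input byte */
    size_t avail_in;               /* input bytes remaining */
    unsigned char *next_out;       /* where the next output byte goes */
    size_t avail_out;              /* output space remaining */
    size_t total_out;              /* total bytes produced so far */
} DemoStream;

enum { DEMO_OK = 0, DEMO_STREAM_END = 1 };

/* Stand-in "stored" codec: copies as much input as fits in the output.
 * Returns DEMO_STREAM_END once finish is set and all input is consumed,
 * DEMO_OK if the caller should supply more space or input and call again. */
int demo_process(DemoStream *s, int finish)
{
    size_t n;

    assert(s != NULL);
    n = s->avail_in < s->avail_out ? s->avail_in : s->avail_out;
    if (n > 0)
        memcpy(s->next_out, s->next_in, n);  /* the extra buffer copy */
    s->next_in += n;
    s->avail_in -= n;
    s->next_out += n;
    s->avail_out -= n;
    s->total_out += n;
    return (finish && s->avail_in == 0) ? DEMO_STREAM_END : DEMO_OK;
}
```

The caller loops, draining the output buffer and resetting next_out and avail_out between calls, until DEMO_STREAM_END is returned. Nothing in the codec knows about files or archives, which is what would make such code easy to lift into another project.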
Some attention should have been paid to internationalization.
Some perhaps useful calls that weren't implemented:
A brief history of NufxLib releases. See "ChangeLog.txt" in the sources for more detail.
Version | Date | Comments |
v0.0 | mid-1998 | Work begins |
v0.1 | 2000/01/17 | First version viewed by test volunteers. |
v0.5 | 2000/02/09 | Alpha test version. |
v0.6 | 2000/03/05 | Beta test version. |
v1.0 | 2000/05/18 | Initial release. |
v1.0.1 | 2000/05/22 | Added workaround for badly-formed archives. |
v1.1 | 2002/10/20 | Many new features, notably support for several compression formats. |
v2.0 | 2003/03/18 | Support for Win32 DLL features. |
v2.0.1 | 2003/10/16 | Added junk-skipping and a workaround for bad option lists; Mac OS X stuff from sheppy. |
v2.0.2 | 2004/03/10 | Handle zeroed MasterEOF, and correctly set permissions on "locked" files. |
v2.0.3 | 2004/10/11 | Fixed some obscure bugs that CiderPress was hitting. |
v2.1.0 | 2005/09/17 | Added kNuValueIgnoreLZW2Len. |
v2.1.1 | 2006/02/18 | Fix two minor bugs. |
v2.2.0 | 2007/02/19 | Switched to BSD license. Identify "bad Mac" archives automatically. |
v2.2.2 | 2014/10/30 | Updated build files, especially for Win32. Moved to github. |
v3.0.0 | 2015/01/09 | Source code overhaul. Added Unicode filename handling. |
v3.1.0 | 2017/09/21 | Improvements to Mac OS X attribute handling. Minor fixes. |
I'd like to thank Eric Shepherd for participating in some ping-pong e-mail sessions while I tried to get autoconf, BeOS, and some crufty versions of "make" figured out for v1.0.
This document is Copyright © 2000-2017 by Andy McFadden. All Rights Reserved.
The latest version can be found on the NuLib web site at http://www.nulib.com/.