Apple Archive Utility (and ditto) and very large ZIP archives
Dragan posted Dec 10th 2009 at 6:25PM
The aim of this post is to draw your attention to some strange things you may run into if you use Apple’s built-in archiving tools, Archive Utility (GUI application) and ditto (command line tool), to create “large” ZIP archives. By “large” I mean archives that exceed the regular ZIP limitations, so that the Zip64 format extension has to be used. In short, DO NOT USE these tools if you plan to share your archives with people using other operating systems or other archiving tools on Mac. They may not be able to open the archives at all. For those who are more interested in the subject, here comes the long story…
First, let’s say something about the structure of a typical ZIP archive, and then define the limitations of the standard ZIP specification and how those are overcome with Zip64 extensions. A very detailed specification of the ZIP format can be found here. I won’t describe all the details in it, as that would take too much time and space. I’ll concentrate only on the things needed to understand why Apple’s archiving utilities fail.
The global structure of a ZIP archive is rather simple; first we have the files contained within the archive, stacked one after another. Every file consists of its local file header, immediately followed by the file data (usually compressed using one of the common compression methods: Deflate, Deflate64, BZIP2…). This local file header contains quite a few pieces of data describing the file, but here we’re interested in two particular fields, compressed size and uncompressed size. Both of these fields are 4 bytes (32 bits) long, which means a regular ZIP archive can store files which are not bigger than 4GB (4,294,967,295 bytes, or 2^32 – 1 bytes). So, we already know the first limitation of the standard ZIP format.
After all the files, there comes some other archive related data, not interesting for this story, and then at the end come the central directory and the end of central directory record. The central directory again contains a file header for each file in the archive, but these headers are “extended” and called central directory file header. Each central directory file header contains more information than its corresponding local file header. For example, the file comment and file attributes fields can be found in the central directory file header, but not in the local file header. The central directory doesn’t contain only extended file headers; it also contains some other archive related data, again not interesting for this story.
And finally, we’ve got the end of central directory record at the very end of the ZIP archive file. There are more fields in this last part, but the one of interest for us is total number of entries in the central directory. This field is 2 bytes (16 bits) long, implying a regular ZIP archive can have only 65,535 (2^16 – 1) files. This is the second limitation we care about for this particular story.
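To make these fixed-width fields concrete, here is a minimal Python sketch that finds the end of central directory record in an archive and unpacks it. The helper names (`read_eocd`, `make_sample_zip`) are made up for illustration, and the sketch assumes a single-disk archive without a comment that happens to contain the record signature.

```python
import io
import struct
import zipfile

# End of central directory record: signature, two disk numbers, two entry
# counts, central directory size and offset, comment length.
EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FORMAT = "<4sHHHHIIH"
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)  # fixed part of the record


def read_eocd(data):
    """Return selected fields of the last end of central directory record."""
    pos = data.rfind(EOCD_SIGNATURE)
    if pos < 0:
        raise ValueError("end of central directory record not found")
    fields = struct.unpack_from(EOCD_FORMAT, data, pos)
    return {
        "entries_on_this_disk": fields[3],
        "total_entries": fields[4],       # 2 bytes: at most 65,535
        "central_dir_size": fields[5],    # 4 bytes: at most 2^32 - 1
        "central_dir_offset": fields[6],  # 4 bytes: at most 2^32 - 1
    }


def make_sample_zip(count):
    """Build a small in-memory archive with `count` tiny stored files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for index in range(count):
            archive.writestr(f"file{index}.txt", "hello")
    return buffer.getvalue()


eocd = read_eocd(make_sample_zip(3))
print(eocd["total_entries"])
```

Because total number of entries is unpacked with the `H` (unsigned 16-bit) format code, there is simply no room in this record for a count above 65,535.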
What happens if some of the limitations of the standard ZIP format are exceeded? Then the Zip64 extension should be used. It effectively adds two chunks of data to the archive, called zip64 end of central directory record and zip64 end of central directory locator. The presence of this additional information determines whether a ZIP archive is a standard one, or one with Zip64 extensions. What is important is that if an archive is in Zip64 format, the compressed size and uncompressed size fields in both local file header and central directory file header (remember, they are only 4 bytes long) should contain the value 0xFFFFFFFF, and the real sizes are written in the so-called extra field in both local file header and central directory file header, where each value is 8 bytes (64 bits) long. (Actually, this isn’t completely true, since things are more complicated with the extra field. The extra field is of variable size and a lot of information can be put into it. A part of that information is the zip64 extended information extra field, and parts of that piece in turn are original uncompressed file size and size of compressed data, which are 8 bytes long. Of course, the zip64 extended information extra field in file headers is present only if the above mentioned zip64 end of central directory record and zip64 end of central directory locator are present in the archive.) So now we know a file inside an archive in Zip64 format can be up to 2^64 – 1 bytes in size. Also, just as the normal end of central directory record has its field total number of entries in the central directory, which is only 2 bytes long (allowing only for 65,535 files in an archive), the zip64 end of central directory record has its own field named total number of entries in the central directory, which is 8 bytes long, thus allowing up to 2^64 – 1 files in a ZIP archive in Zip64 format.
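The extra field mechanism can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the 0x9999 block is a made-up unrelated extra block, and the parser assumes both 8-byte sizes are present (as they should be in a local file header).

```python
import struct

ZIP64_EXTRA_ID = 0x0001        # header ID of the zip64 extended information extra field
SIZE_PLACEHOLDER = 0xFFFFFFFF  # value stored in the 4-byte size fields instead


def build_zip64_extra(uncompressed_size, compressed_size):
    """Pack header ID, data size (16) and the two 8-byte sizes."""
    return struct.pack("<HHQQ", ZIP64_EXTRA_ID, 16,
                       uncompressed_size, compressed_size)


def parse_zip64_sizes(extra):
    """Walk the (header ID, data size, data) blocks of an extra field.

    Returns (uncompressed_size, compressed_size), or None when the extra
    field holds no zip64 block.
    """
    pos = 0
    while pos + 4 <= len(extra):
        header_id, data_size = struct.unpack_from("<HH", extra, pos)
        if header_id == ZIP64_EXTRA_ID and data_size >= 16:
            return struct.unpack_from("<QQ", extra, pos + 4)
        pos += 4 + data_size  # skip blocks we are not interested in
    return None


five_gib = 5 * 1024 ** 3  # too big for a 4-byte size field
# A hypothetical unrelated block (ID 0x9999) followed by the zip64 block.
extra = struct.pack("<HH3s", 0x9999, 3, b"abc") + build_zip64_extra(five_gib, five_gib)
print(parse_zip64_sizes(extra))
```

A reader that finds `SIZE_PLACEHOLDER` in the 4-byte size fields is expected to look up the real sizes this way.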
To make this whole story more complete (and probably clear) here is the comparison of limitations of standard and Zip64 archives (the table includes some other limitations, not mentioned above):
| Attribute | Standard Format | Zip64 Format |
| --- | --- | --- |
| Number of Files Inside an Archive | 65,535 | 2^64 – 1 |
| Size of a File Inside an Archive (bytes) | 4,294,967,295 | 2^64 – 1 |
| Size of an Archive (bytes) | 4,294,967,295 | 2^64 – 1 |
| Number of Segments in a Segmented Archive | 999 (spanning), 65,535 (splitting) | 4,294,967,295 – 1 |
| Central Directory Size (bytes) | 4,294,967,295 | 2^64 – 1 |
Now, what do the Apple tools do? As long as none of the limitations of the standard ZIP format is reached, everything is fine. But once at least one of those limitations is reached, a proper tool should automatically switch to the Zip64 format, add zip64 end of central directory record and zip64 end of central directory locator (with relevant correct information) to the archive, start putting information about file sizes in extra field > zip64 extended information extra field > original uncompressed file size and extra field > zip64 extended information extra field > size of compressed data, and fill compressed size and uncompressed size of the file headers with 0xFFFFFFFF. Instead of that, Apple’s archiving tools continue populating the archive with new files as if it were a standard ZIP archive, stacking new files one after another. This means there is no Zip64 information at all, and the information about file sizes and the number of files inside the archive is incorrect. We may rightly say the archive is corrupted (although all the important data of all archived files is still present).
What happens when you encounter such a ZIP archive depends on the tool you use to process it. We can identify three base cases here:
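The decision a well-behaved tool has to make can be sketched like this in Python. The function name and the wording of the reasons are invented for illustration; only the three limits discussed in this post are checked.

```python
ZIP_MAX_ENTRIES = 0xFFFF    # 65,535 files
ZIP_MAX_32BIT = 0xFFFFFFFF  # 4,294,967,295 bytes


def zip64_reasons(file_sizes, archive_size):
    """Return the reasons (if any) why an archive needs Zip64 extensions."""
    reasons = []
    if len(file_sizes) > ZIP_MAX_ENTRIES:
        reasons.append("more than 65,535 files")
    if any(size > ZIP_MAX_32BIT for size in file_sizes):
        reasons.append("a file larger than 4GB")
    if archive_size > ZIP_MAX_32BIT:
        reasons.append("archive larger than 4GB")
    return reasons


# 70,000 small files in a small archive: only the file count is a problem.
print(zip64_reasons([100] * 70000, 7_000_000))
# A handful of small files: a standard ZIP archive is enough.
print(zip64_reasons([100, 200, 300], 600))
```

An empty result means a standard archive is sufficient; anything else means the Zip64 records and extra fields have to be written. Python’s own `zipfile` module, for comparison, makes this switch by itself unless `allowZip64=False` is passed to `ZipFile`.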
1. The number of files in archive is greater than 65535, but all files are still smaller than 4GB and the total archive size is less than 4GB.
If the tool doesn’t use the information about the archive stored in the central directory, but just enumerates through all the files (which I believe is the wrong thing to do), then you’ll be able to open the archive and extract all files from it. Apple’s archiving utilities behave this way, as do many archiving tools for Mac which run the 7-zip command line tool in the background (since 7-zip behaves the same, it just enumerates through the files).
If the tool uses the information stored in the central directory (which I believe is the right thing to do, since that’s the sole purpose of the central directory’s existence), then you won’t be able to see all the files in the archive. Most likely the tool will see the real value modulo 65536. For example, if there are 70000 files in the archive, the tool will see only the first 70000 – 65536 = 4464 files and extract them. All other files are unreachable by the tool (although they are in the archive). If you have access to a Windows box with WinZip installed, you can confirm this very easily; create such an archive (70000 small files) with Apple Archive Utility, open it with WinZip and see how many files are (reported) there.
2. The total archive size is greater than 4GB, but each individual file is smaller than 4GB.
If the tool doesn’t use information about the archive stored in the central directory, but just enumerates through all the files, I assume all files will be reachable and possible to extract. I didn’t try this myself, but I don’t see the reason why they would behave differently.
If the tool uses the information stored in the central directory, I don’t know exactly what would happen, since I didn’t try it myself. It may be that the tool would open and extract the archive without any problems, but it may very well be that the tool would report the archive as corrupted and wouldn’t open it at all.
3. At least one file in the archive is greater than 4GB.
I didn’t try, but I assume a tool that doesn’t use the information about the archive stored in the central directory, but just enumerates through all the files, would be able to open the archive, but extraction would most probably fail, since the file size and its offset in the archive are reported incorrectly, so the tool may look for the file in the wrong place and expect to extract/decompress the wrong number of bytes.
If the tool uses the information stored in the central directory, it won’t open the archive and will report it as corrupted, since the reported file sizes and file offsets don’t match.
Of course, you can make various combinations of the above common cases, with combined resulting behaviour.
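The wrong numbers in the cases above are what you get when a value is squeezed into a field that is too small for it. A tiny Python sketch, assuming the faulty tool simply keeps the low-order bits (this matches the 70000 to 4464 example from case 1; for file sizes it is an assumption, not something tested here):

```python
def as_16_bit(value):
    """Keep only what fits into a 2-byte field (such as the entry count)."""
    return value & 0xFFFF


def as_32_bit(value):
    """Keep only what fits into a 4-byte field (such as a size or offset)."""
    return value & 0xFFFFFFFF


print(as_16_bit(70000))          # 4464 files reported instead of 70000
print(as_32_bit(5 * 1024 ** 3))  # 1073741824 bytes (1GB) instead of 5GB
```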
As a conclusion to this long and hopefully informative story, I’ll repeat once more: if you plan to share your big archives (more than 65535 files and/or an archive bigger than 4GB and/or a file in the archive bigger than 4GB) with people using other operating systems or other archiving tools on Mac, DO NOT USE the Apple archiving tools built into Mac OS X. I’d also like to point out that Springy uses the central directory to open and gather information about a ZIP archive and the files inside, but with built-in tricks and workarounds, just to be able to process faulty archives made by Apple tools. Even more, if you open such an archive and modify it, Springy will automatically “fix” it into a proper Zip64 archive!
Apple is aware of all these bugs; I reported them quite some time ago. Unfortunately, nothing has happened since, and the status of the bugs is still “open”.
Dragan
Springy 1.6 released, immediately followed by 1.6.1
Dragan posted Nov 19th 2009 at 11:50PM
Hello everybody,
As you have probably noticed, there were two versions of Springy released on two consecutive days! Version 1.6 was probably the busiest release since Springy started its life as a shareware application. Two days full of application building, disk image making, downloading and uploading, answering e-mail and forum messages, fixing bugs and then all over again.
First, version 1.6 was released yesterday at around 18:00 CET. I just posted information on a couple of relevant Mac sites and then had to go to finish some other stuff. That also took up part of this morning, and just as I finished and started answering some forum messages, Chris discovered a very nasty bug, which needed immediate fixing. Luckily, the fix was easy, but it took a while to do the whole release cycle one more time before version 1.6.1 became publicly available. During that time, version 1.6 was unavailable for download. I didn’t want to put anyone at risk of corrupting some important data with the buggy application.
Now that it’s all (hopefully) over, let’s talk about what really is new (bugs excluded :-)) in version 1.6 (and 1.6.1). You can find the full list of news at the version history page, and here I’d like to stress the items I find most important. But before doing that, I want to mention one thing I did NOT do: the Springy help system was not updated. I had so many things to do in a short period of time, and the help system is getting fatter and fatter with all the new options Springy gets. It’d take me an additional couple of days to prepare it, but I didn’t want to wait that long. Users have been waiting for the version which will provide the Services Menu in Snow Leopard for too long already, and I believe they prefer to get it even with the old help system than to wait some more. Of course, the help system will be updated to explain all the new features and I hope it will see the light of day in version 1.6.2 or 1.6.3.
And now, let’s talk about news!
Services Menu, of course! In Snow Leopard this is the preferred way of doing things which were usually done with contextual menu plug-ins in earlier OS versions. More about how to use Springy Services can be read here. I’ll just tell you how to enable them, since you have to do it manually. Launch the Springy application at least once (or you can logout/login, or reboot, but I assume just launching Springy is quicker and more straightforward) and it will register its services with the system. Then, launch the System Preferences app and go to System Preferences > Keyboard > Keyboard Shortcuts > Services:

In the section “Files and Folders” you will see many new services provided by Springy. Check the ones you want to use and off you go. You may think some archiving services are duplicated, but a more careful look reveals a subtle difference between the two: one is without the so-called ellipsis character (...), the other one has it. The first one immediately starts archiving the selected files using the default archiving parameters from user preferences, while the latter one presents a panel, where you set up the archiving parameters to be used for that particular archiving task. I know this configuration looks a bit strange (and possibly ugly). I’d much rather have it solved by having only one item, which dynamically changes its look and behaviour when pressing a modifier key, the same way as implemented in the SpringyCM contextual menu plug-in. Unfortunately, the Services Menu architecture doesn’t allow for any dynamism. I made a request to Apple for such behaviour some time ago; we’ll see whether they will provide any. Changing the behaviour of the menu item with a modifier key is possible even now, but not the way it’s presented to the user. In order to avoid more confusion (questions like “why does the archiving parameters panel appear sometimes, and not some other times?”), I decided to go with two separate menu items, until some more dynamism is provided by Apple. If you have a better idea how to solve this, I’d like to hear it.
Direct extraction mode is another great feature many users were asking for. When in this mode, Springy doesn’t open an archive the regular way for browsing, but immediately starts extracting all files from it, much as the Mac OS X built-in Archive Utility does. If Springy is set as the default handler for a particular archive type and it works in direct extraction mode, a double-click on an archive file will start immediate extraction. This doesn’t apply to double-clicking the archive file only; it also applies to all other ways of opening a file: drag & drop onto the Springy application icon in Finder or Dock, using the Open and Open With Finder contextual menu items, etc. Whether Springy works by default in Direct Extraction Mode or Open for Browsing mode is easily set in Springy > General preferences:

Default behaviour can be overruled with fn modifier key in Leopard and Snow Leopard (*cmnd* key is used in Tiger). Further fine tuning of direct extraction is possible with settings in Springy > Extracting preferences:

You can choose whether direct extraction starts immediately in the same parent folder where the archive is, or an open panel is presented, giving you the possibility to select the destination location. Again, the default behaviour can be overruled using the alt modifier key.
Support for 7-zip archives! Finally some support, at least. At the moment, Springy is only able to open/read/extract 7Z archives, but full support will come during the 1.6.x upgrading cycle. I’ve already written about hurdles of implementing 7-zip support, so I won’t repeat it here. I’ll only say it takes this long because I really want it implemented in a good way, which also takes care of good error handling. For example, if you have a corrupted 7Z archive, Springy will clearly tell you what is wrong. Or if you need to supply a password to extract encrypted files from a 7Z archive, and you mistype it, Springy will go back and ask you again for the password for the same file. Other tools available on Mac would just abort extraction and say the password may be incorrect, etc…
The same code used to handle 7Z archives can also process Microsoft CAB archives, so in the unlikely event of stumbling upon a CAB archive you need to browse and extract, Springy will serve you properly.
Flat List View. This is like having the normal list view, but without any hierarchy, just a list of all files in an archive. This is how WinZip presents archive contents. Personally I don’t like it at all, but there seem to be many switchers from Windows to Mac who asked me to make something similar. I hope this will make them happy:

I’ll leave it to you to discover the other new features and improvements yourself. In the next few days, while the work on the updated help system is in progress, I’ll make an appropriate F.A.Q. page which will briefly explain some of the new things and how to use them.
Enjoy the new version!
Dragan
Springy Services (SpringyCM life after death)
Dragan posted Oct 4th 2009 at 9:25PM
In the previous blog post I explained in detail the full life cycle of the Springy contextual menu plug-in. The reason it will start dying slowly and finally pass away with support for Leopard is Apple’s decision to kill contextual menu plug-ins (together with other means of loading arbitrary code into 64-bit Cocoa applications) in Snow Leopard. This effectively prevents Finder in Snow Leopard (which is a full 64-bit Cocoa application) from using third party contextual menu plug-ins.
Apple suggests using the system Services Menu as a replacement for CM plug-ins, so I decided to go that route and not to try to invent some unsupported way of loading CM code into Finder, which could break with every OS update. The Services Menu has undergone a complete overhaul in Snow Leopard. It’s much more configurable now and more pleasant to use. Also, the OS provides Services Menu items in more places, not only in the standard, well known, Services Menu.
Springy will become a so-called “services provider” application in version 1.6. It will offer its services not only to Finder, but also to every (Cocoa) application which can provide a list of file names on the pasteboard for the Services system to use. I didn’t check it myself, but this will probably enable the use of Springy commands from other file managers (like Path Finder and ForkLift), and even from other applications, such as Apple Mail for example, where you’ll be able to invoke Springy commands on e-mail message attachments.
Now, the time has come for a small show-off. The following shots show different ways Springy commands will be invoked:
This one shows Springy commands in the Services Menu in Finder:

Here, you see Springy commands in the menu of the Action button in Finder:

And of course, Snow Leopard also shows Services Menu items in the contextual menu, as you can see here:

When any of these actions is selected, the Springy application will start, not in its normal mode (a window which shows the contents of an archive), but in “services provider mode”, with a UI similar to that of the CM plug-in:

Upon finishing a task, Springy application will automatically terminate, unless some other processing is ongoing or there is at least one open archive with window showing its contents.
As you can see, you’ll be able to use the Springy application in “services provider mode” just like the old SpringyCM plug-in. But you can also find its commands elsewhere in the system, and all this flexibility is provided by the OS.
Unfortunately, as stated here, it will no longer be possible to browse archive contents using a hierarchical contextual menu. Implementing this would require a much more dynamic Services Menu architecture. I’ve already filed a bug report/request for Apple to provide such an architecture; hopefully they will listen. Some users have already said to me that while they found CM-browsing of archives very impressive, they personally did not find it as fundamental as the other functionality that Springy will continue to be able to provide in Snow Leopard. I hope most users feel the same way, but I’m pretty sure some will be very disappointed by this. I myself am very disappointed and I really hope something will change in that regard very soon.
One unanswered question is when Springy 1.6, with all these goodies, will be publicly available. My intention is to release it at the very beginning of November. The only thing preventing me from doing it sooner is fixing some bugs and improving some aspects of reading and extracting 7Z archives. Yes, in version 1.6 Springy will finally offer some kind of support for 7Z archives! In the beginning, this support will include only opening/browsing/extracting files from 7Z archives, but during the 1.6.x development cycle all other operations on 7Z archives should be in place for users to try out.
I hope you all are looking forward to release of Springy 1.6, especially those who have already switched to Snow Leopard.
Dragan