The development of Springy 2.0 has started

Dragan posted Mar 1st 2010 at 2:40PM

The recent period, since the unfortunate event described here, was very hectic for me, on both the professional and the private front. But that time wasn’t spent for nothing. I set up the necessary infrastructure for the new Springy development: the whole Development folder (where I keep all my projects, sources, documentation and other development-related stuff), while being regularly backed up to in-house backup storage using Time Machine, will also be backed up on a daily basis to TWO remote servers/cloud services. In addition, every local Git repository will have a copy on TWO remote Git repository servers, which will also be updated (commits pushed) on a daily basis. All this may seem a bit paranoid, but I certainly don’t want to lose any important data again, and in that regard I don’t mind some extra precautions. I assume I’ve learnt my lesson well.

Moreover, the initial development of Springy 2.0 has already started. As I mentioned in the previous blog post, this will be a good opportunity to do some refactoring and redesigning of many things for which I haven’t had time before, as I was constantly tempted to add new features. I started from the ground up, implementing faster and more robust automatic archive type and format recognition. Concrete results of this will be available very soon, in the form of a Spotlight Importer plug-in which will enable you to index and search for files inside archives and disk images. This importer will provide significant improvements over existing ones on the market (including Apple’s Spotlight support for ZIP, TAR and CPIO archives in Snow Leopard), as it will support a much wider range of archive types and formats (DMG and ISO images included), as well as all the different archive specifics, such as Zip64 extensions, file encryption, multi-segmented archives, etc.

Soon after the Spotlight Importer plug-in, in parallel with continued development of Springy, I plan to release a QuickLook plug-in for quick preview of the contents of a wide range of archive and disk image types. There are a couple of these QL plug-ins on the market, but I think this one will be a bit special in the way it shows and browses archive contents, and I hope many people will prefer it to the existing ones.

With all these activities I hope to assure all current and future Springy users that I’m pretty serious about its development and my intention to deliver version 2.0 somewhere in the spring – summer period next year. I’ll keep you all updated about my progress both here (with less frequent, but more descriptive and elaborate posts) and on the Twitter account springyapp (more frequent, but short announcements lacking further details).

Dragan

 


 

Sad current state of development of Springy

Dragan posted Feb 5th 2010 at 8:00PM

The cause for this blog post is not a pleasant one. Not pleasant at all. Its main goal is to inform you about the near future of the Springy project, but also to admit that I didn’t seem to be quite serious about it, at least not when it comes to keeping its sources and resources safe enough.

About one and a half months ago I had burglars “visiting” my home on a nice December Saturday evening, while I was out. Among other things, they stole two laptops, including my MBP, which was my primary development machine. But they also stole the external hard disk hosting my Time Machine backup, as well as one pretty ugly and unattractive small PC box running FreeBSD, which served (among other things) as my Git repository server. I didn’t have the latest versions of the Springy source code anywhere outside my home on any remote server. As a result of all this trouble, I don’t possess the source code for Springy any more! This also means that at the moment I cannot maintain it properly!

I didn’t rush to inform you about this tragic incident as I wanted to see if there was any chance for the police to get my things back, or at least to plead with the burglars to somehow give my important data back. But now, after more than a month, I’m pretty sure there’s nothing that can be done about it. All Springy source code is gone for good.

I also spent some time browsing some older backups, hoping I might find something useful. It turned out that the latest Springy source code backup I had outside my home was that of version 1.3.3, from just before I switched to Leopard and started using Time Machine and Git as a version control system. As useful as that old source code is, it’s far from enough to be able to continue smooth development of Springy; many features have been added in the meantime, and possibly even more things were completely redesigned, improved and changed overall. I may recover some of those by trying to remember what I did, browsing some old sketches made on paper and trying various reverse engineering techniques, but I don’t see it as a very useful or ultimately successful process. Therefore my general conclusion is that the complete source code for the Springy project is irreversibly gone.

This whole story raises the question of the future of the Springy application. I have spent quite some time recently thinking about how to proceed and whether to do so at all. The idea of redoing everything and going through all the known problems over again wasn’t very appealing at all. On the other hand, I think Springy is a very good archiving utility, with the right concept defined from the very beginning (everything must be done inside the application itself, with no command line tools running inside NSTask and doing the real processing in the background, as so many other archiving utilities on the Mac do), and a decent implementation. Of course, it can always be better, and the whole process of developing Springy was one of continuous improvement of the implementation, as my coding skills (hopefully) got better. Not to be neglected are the people who have already purchased it and who probably use it on a daily basis for tasks very important to them; they expect decent support in case they encounter a problem or find a bug. For those reasons alone, and because I really enjoyed developing Springy, I decided to start all over again and continue its development.

Unfortunately, the first results of that decision won’t be available soon. There are several reasons for that. The main one is that Springy has been a side job/hobby project from the very beginning. I have a full-time job (unfortunately, it has nothing to do with Mac software development) and a private life to live, so the time I can dedicate to Springy is rather limited. It’s usually 12 – 22 hours per week, depending on the situation. This is also the reason the development of Springy so far wasn’t as fast as it could’ve been. I’d really like to turn Springy into my main job and dedicate myself to it full-time, but at the moment the revenue I get from it is far, far from what I need to pay my bills and still live a normal life. If at some point this situation changes, I’ll definitely consider my options, but I don’t see it happening in the near future.

The second main reason why the first results of the new Springy development won’t become available any time soon is that I don’t want to just redo it from the existing code (version 1.3.3) and add some more things on top of it. Even while developing/maintaining the current version, I already wanted to change, redesign and refactor much of the existing code. I learned a lot of things in the process, but Apple has also come up with a lot of new technologies in the meantime, and although I’ve tried them on some test projects, I’ve never had enough time to fully embrace them and incorporate them into the Springy code. While this will probably require less code to write, it doesn’t automatically mean less time, as I really need to fully incorporate those technologies, and since I haven’t done it before, I do expect to have some problems, questions and doubts over the correct way of doing things.

Taking all this into account, I don’t expect the new version of Springy to appear before mid-2011. Normally (with full-time dedication), it’d take a couple of months, but due to all the things explained above, the time frame expands considerably. Knowing this, it’s good to comment on the current status of Springy, as of version 1.6.1. The current implementation seems to be pretty good and stable, at least judging from the number and kind of user complaints and bug reports. The most commonly reported problem is that the Springy application sometimes crashes for some users while browsing an archive, and it can be traced back to icon preview generation. As that feature can be switched off very easily, I don’t see it as a really big problem (you can read more about this here and here). But there are some bugs which directly affect the functionality and they deserve a mention here:

1. This bug is related to the SpringyCM contextual menu plug-in, for those users still running Leopard and Tiger. When a ZIP archive is extracted using SpringyCM, the plug-in doesn’t properly clean up all of its traces afterwards (a handle to the archive file stays open). The consequence is that one cannot easily delete the extracted archive by moving it to the Trash and emptying the Trash. The system will report that the file is still “in use”. Doing “Secure Empty Trash…” in Finder will, however, close the open handle and permanently delete the file. I’d rate this bug more as an annoyance, as it doesn’t really cripple the functionality of Springy in any way.

2. This one is more severe: ISO disk images can’t be created using the Springy application, only using the SpringyCM plug-in in Leopard and Tiger, and the “Archive in ISO” service from the System Services menu in Snow Leopard. If one tries to select “ISO Image” from the pop-up button in the save dialog in the application, Springy will actually create an archive of the previously selected type! This came about as a consequence of some NIB file editing I did at the last moment for version 1.6, which I apparently didn’t test well enough. I know probably not many people use Springy to create ISO disk images, but this bug is still quite disturbing. Unfortunately, nothing can be done before the new version comes out. For those who need to create ISO images, please use the “Archive in ISO” service from the System Services menu in Snow Leopard or the SpringyCM plug-in in Leopard and Tiger.

3. It seems creating a ZIP archive using the “Archive in ZIP” service from the System Services menu (Snow Leopard only) is not completely reliable! For some reason, only sometimes, the preparation phase of the archiving process is not completed, so some files are missing in the archiving phase. The consequence is that the resulting archive may not contain all the files you’d expect it to. I really don’t have any clue why this happens. It was reported to me by a user and I managed to reproduce it myself. It doesn’t always happen and is very inconsistent in behaviour. For example, I managed to correctly archive a bunch of files this way, then I did some other processing and then tried to archive the same bunch of files again, and it failed (some files were missing from the resulting archive). This can be very annoying, but I won’t be able to do anything about it before the new version comes out. If you find yourself in a similar situation and encounter this behaviour, please use the Springy application directly (not the “Archive in ZIP” service menu) to create ZIP archives. For other archive types, everything works as it should.

I really hope people will be able to live with these bugs, as I really can’t do anything about them on short notice for all the reasons explained above.

I’ll post regularly over here about my progress with the new development of Springy. I’ve also created a new Twitter account springyapp (the name springy is already taken). While I’m not very much of a “twittering” person, I’ll try to use the account to update interested people about the progress more often, but in very short messages.

Needless to say, I’ve already started making full Time Machine backups to an external hard drive again, but I now also make regular daily backups of selected folders to two remote servers, and I keep my Git repositories not only locally but also on two remote Git repository servers. It’s a pity I came to appreciate all these precautions the hard way, once I had lost all my source code, photos, music, movies, mails and many other very important documents and files. But it’s still better late than never. I hope some people reading this may learn from my experience and prevent anything similar happening to them.

Dragan

 


 

Apple Archive Utility (and ditto) and very large ZIP archives

Dragan posted Dec 10th 2009 at 6:25PM

This post’s aim is to draw your attention to some strange things you may expect if you use Apple’s built-in archiving tools, Archive Utility (GUI application) and ditto (command line tool), to create “large” ZIP archives. By “large” I mean archives that exceed the regular ZIP limitations, so that the Zip64 format extension has to be used. In short, DO NOT USE these tools if you plan to share your archives with people using other operating systems or other archiving tools on the Mac. They may not be able to open the archives at all. For those who are more interested in the subject, here comes the long story…

First, let’s say something about the structure of a typical ZIP archive, and then define the limitations of the standard ZIP specification and how those are overcome with Zip64 extensions. A very detailed specification of the ZIP format can be found here. I won’t describe all the details in it, as that would take too much time and space; I’ll concentrate only on the things needed to understand why Apple’s archiving utilities fail.

The global structure of a ZIP archive is rather simple. First come the files contained within the archive, stacked one after another. Every file consists of its local file header, immediately followed by the file data (usually compressed using one of the common compression methods: Deflate, Deflate64, BZIP2…). The local file header contains quite a few pieces of data describing the file, but here we’re interested in two particular fields, compressed size and uncompressed size. Both of these fields are 4 bytes (32 bits) long, which means a regular ZIP archive can store files no bigger than 4GB (4,294,967,295 bytes, or 2^32 – 1 bytes). So, we already know the first limitation of the standard ZIP format. After all the files comes some other archive-related data, not interesting for this story, and then at the end come the central directory and the end of central directory record. The central directory again contains a file header for each file in the archive, but these headers are “extended” and each is called a central directory file header. Each central directory file header contains more information than its corresponding local file header. For example, the file comment and file attributes fields can be found in the central directory file header, but not in the local file header. The central directory doesn’t contain only extended file headers; it also contains some other archive-related data, again not interesting for this story. And finally, we’ve got the end of central directory record at the very end of the ZIP archive file. There are more fields in this last part, but the one of interest to us is total number of entries in the central directory. This field is 2 bytes (16 bits) long, implying a regular ZIP archive can hold only 65,535 (2^16 – 1) files. This is the second limitation we care about for this particular story.
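To make the two fields concrete, here is a minimal sketch in Python (chosen only for illustration; it is not the language Springy is written in). It builds a tiny archive in memory with the standard zipfile module and then reads the size fields of the first local file header and the entry count of the end of central directory record by hand. The helper names are made up for this example, and locating the end record with a simple backwards search is an assumption that holds only for archives without a tricky comment.

```python
import io
import struct
import zipfile

LFH_SIG = b"PK\x03\x04"    # local file header signature
EOCD_SIG = b"PK\x05\x06"   # end of central directory record signature

MAX_FILE_SIZE = 2**32 - 1  # largest value a 4-byte size field can hold
MAX_ENTRIES = 2**16 - 1    # largest value the 2-byte entry count can hold


def build_sample_zip():
    """Create a tiny, uncompressed ZIP archive in memory (illustrative data)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("a.txt", b"hello")
        zf.writestr("b.txt", b"world!!")
    return buf.getvalue()


def read_first_local_header(data):
    """Return (compressed size, uncompressed size) of the first file."""
    # Fixed 30-byte part: signature, version, flags, method, time, date,
    # CRC-32, compressed size, uncompressed size, name length, extra length.
    fields = struct.unpack_from("<4sHHHHHIIIHH", data, 0)
    if fields[0] != LFH_SIG:
        raise ValueError("archive does not start with a local file header")
    return fields[7], fields[8]


def read_eocd_entry_count(data):
    """Return 'total number of entries in the central directory'."""
    pos = data.rfind(EOCD_SIG)  # simplification: assumes a plain comment
    if pos < 0:
        raise ValueError("end of central directory record not found")
    # signature, disk number, disk with central directory, entries on this
    # disk, total entries, directory size, directory offset, comment length
    fields = struct.unpack_from("<4sHHHHIIH", data, pos)
    return fields[4]


if __name__ == "__main__":
    archive = build_sample_zip()
    print(read_first_local_header(archive))
    print(read_eocd_entry_count(archive))
```

The format strings spell out the widths: `I` is a 4-byte unsigned field and `H` a 2-byte one, which is exactly where the two limits come from.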

What happens if some of the limitations of the standard ZIP format are exceeded? Then the Zip64 extension should be used. It effectively adds two chunks of data to the archive, called the zip64 end of central directory record and the zip64 end of central directory locator. The presence of this additional information determines whether a ZIP archive is a standard one or one with Zip64 extensions. What is important is that if an archive is in Zip64 format, the compressed size and uncompressed size fields in both the local file header and the central directory file header (remember, they are only 4 bytes long) should contain the value 0xFFFFFFFF, and the real sizes are written in the so-called extra field of both headers, as values that are 8 bytes (64 bits) long. (Actually, this isn’t completely true, since things with the extra field are more complicated. The extra field is of variable size and a lot of information can be put into it. One part of that information is the zip64 extended information extra field, and parts of that piece in turn are original uncompressed file size and size of compressed data, which are 8 bytes long. Of course, the zip64 extended information extra field in file headers is present only if the above-mentioned zip64 end of central directory record and zip64 end of central directory locator are present in the archive.) So now we know that a file in a Zip64 archive can be up to 2^64 – 1 bytes in size. Also, just as the normal end of central directory record has its field total number of entries in the central directory, which is only 2 bytes long (allowing only 65,535 files in an archive), the zip64 end of central directory record has its own field named total number of entries in the central directory, which is 8 bytes long, thus allowing up to 2^64 – 1 files in a ZIP archive in the Zip64 format.
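As a rough illustration of how a reader resolves the real sizes, the Python sketch below walks the blocks of an extra field and, when it finds the zip64 extended information extra field (header ID 0x0001 in the ZIP specification), replaces any 4-byte size that holds 0xFFFFFFFF with the 8-byte value stored there. The function name and the sample data are invented for this example, and the sketch handles only the two size fields discussed here.

```python
import struct

ZIP64_EXTRA_ID = 0x0001    # header ID of the zip64 extended information extra field
SENTINEL_32 = 0xFFFFFFFF   # "look in the extra field" marker in a 4-byte size field


def real_sizes(uncompressed32, compressed32, extra):
    """Resolve the real (uncompressed, compressed) sizes of one file header.

    `extra` is the raw extra field: a sequence of blocks, each made of a
    2-byte header ID, a 2-byte data length and then the data itself.
    """
    pos = 0
    while pos + 4 <= len(extra):
        header_id, length = struct.unpack_from("<HH", extra, pos)
        body = extra[pos + 4:pos + 4 + length]
        if header_id == ZIP64_EXTRA_ID:
            offset = 0
            # 8-byte values are stored only for fields set to 0xFFFFFFFF,
            # uncompressed size first, then compressed size.
            if uncompressed32 == SENTINEL_32:
                (uncompressed32,) = struct.unpack_from("<Q", body, offset)
                offset += 8
            if compressed32 == SENTINEL_32:
                (compressed32,) = struct.unpack_from("<Q", body, offset)
            break
        pos += 4 + length  # skip blocks we are not interested in
    return uncompressed32, compressed32


if __name__ == "__main__":
    five_gib = 5 * 1024**3  # made-up size, too big for a 4-byte field
    extra = struct.pack("<HHQQ", ZIP64_EXTRA_ID, 16, five_gib, five_gib)
    print(real_sizes(SENTINEL_32, SENTINEL_32, extra))
```

A header whose size fields hold ordinary values is returned unchanged, so the same function covers both standard and Zip64 entries.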

To make this whole story more complete (and probably clear) here is the comparison of limitations of standard and Zip64 archives (the table includes some other limitations, not mentioned above):

Attribute                                 | Standard Format                    | Zip64 Format
Number of Files Inside an Archive         | 65,535                             | 2^64 – 1
Size of a File Inside an Archive (bytes)  | 4,294,967,295                      | 2^64 – 1
Size of an Archive (bytes)                | 4,294,967,295                      | 2^64 – 1
Number of Segments in a Segmented Archive | 999 (spanning), 65,535 (splitting) | 4,294,967,295 – 1
Central Directory Size (bytes)            | 4,294,967,295                      | 2^64 – 1

Now, what do Apple’s tools do? As long as none of the limitations of the standard ZIP format is reached, everything is fine. But once at least one of those limitations is reached, a proper tool should automatically switch to the Zip64 format, add the zip64 end of central directory record and zip64 end of central directory locator (with relevant correct information) to the archive, start putting information about file sizes in extra field > zip64 extended information extra field > original uncompressed file size and extra field > zip64 extended information extra field > size of compressed data, and fill compressed size and uncompressed size of the file headers with 0xFFFFFFFF. Instead, Apple’s archiving tools continue populating the archive with new files as if it were a standard ZIP archive, stacking new files one after another. This means no Zip64 information is present at all, and the information about file sizes and the number of files inside the archive is incorrect. We may rightly say the archive is corrupted (although all the important data of all archived files is still present).
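The decision a well-behaved archiver has to make can be sketched in a few lines of Python. This is only an illustration under the limits listed in the table above, not the logic of any particular tool; the function name and the wording of the reasons are made up, and real tools may switch to Zip64 slightly earlier than these exact thresholds.

```python
MAX_16 = 0xFFFF        # 65,535: largest entry count in a standard archive
MAX_32 = 0xFFFFFFFF    # 4,294,967,295: largest 4-byte size or offset


def zip64_reasons(file_sizes, archive_size):
    """Return the standard ZIP limits a planned archive would exceed.

    `file_sizes` holds the size in bytes of every file to be archived and
    `archive_size` is the expected total size of the archive in bytes.
    An empty list means a standard ZIP archive is sufficient.
    """
    reasons = []
    if len(file_sizes) > MAX_16:
        reasons.append("more than 65,535 files")
    if any(size > MAX_32 for size in file_sizes):
        reasons.append("a file larger than 4,294,967,295 bytes")
    if archive_size > MAX_32:
        reasons.append("an archive larger than 4,294,967,295 bytes")
    return reasons


if __name__ == "__main__":
    # 70000 one-byte files: only the entry count limit is exceeded.
    print(zip64_reasons([1] * 70000, 70000))
```

Whenever the returned list is not empty, the writer should emit the Zip64 records described above instead of carrying on in the standard format.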

What happens when you encounter such a ZIP archive depends on the tool you use to process it. We can identify three base cases here:

1. The number of files in the archive is greater than 65535, but all files are still smaller than 4GB and the total archive size is less than 4GB.
If the tool doesn’t use the information about the archive stored in the central directory, but just enumerates through all the files (which I believe is the wrong thing to do), then you’ll be able to open the archive and extract all files from it. Apple’s archiving utilities behave this way, as do many archiving tools for Mac which run the 7-zip command line tool in the background (since 7-zip behaves the same, it just enumerates through the files).
If the tool uses the information stored in the central directory (which I believe is the right thing to do, as that’s the sole purpose of the central directory’s existence), then you won’t be able to see all the files in the archive. Most likely the tool will see the real number of files modulo 65536. For example, if there are 70000 files in the archive, the tool will see only the first 70000 – 65536 = 4464 files and extract them. All other files are unreachable by the tool (although they are in the archive). If you have access to a Windows box with WinZip installed, you can confirm this very easily: create such an archive (70000 small files) with Apple’s Archive Utility, open it with WinZip and see how many files are (reported) there.

2. The total archive size is greater than 4GB, but the size of each individual file is less than 4GB.
If the tool doesn’t use the information about the archive stored in the central directory, but just enumerates through all the files, I assume all files will be reachable and possible to extract. I didn’t try this myself, but I don’t see a reason why such tools would behave differently.
If the tool uses the information stored in the central directory, I don’t know exactly what would happen, since I didn’t try it myself. It may be that the tool would open and extract the archive without any problems, but it may very well be that the tool would report the archive as corrupted and wouldn’t open it at all.

3. At least one file in the archive is greater than 4GB.
I didn’t try, but I assume a tool that doesn’t use the information about the archive stored in the central directory, but just enumerates through all the files, would be able to open the archive, but extraction would most probably fail, since the file size and its offset in the archive are reported incorrectly, so the tool may look for the file in the wrong place and expect to extract/decompress the wrong number of bytes.
If the tool uses the information stored in the central directory, it won’t open the archive and will report it as corrupted, since the reported file sizes and file offsets don’t match.

Of course, you can make various combinations of the above base cases, with correspondingly combined behaviour.
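The numbers a central directory reader ends up with in these cases can be worked out with simple modular arithmetic. The Python sketch below assumes that a writer which ignores Zip64 simply keeps the low 16 or 32 bits of the real value, which matches the modulo behaviour described in case 1; the function names are made up for this example.

```python
def stored_entry_count(real_count):
    """Value left in the 2-byte entry count field when the real count overflows."""
    return real_count % 2**16   # only the low 16 bits fit


def stored_size(real_size):
    """Value left in a 4-byte size or offset field when the real value overflows."""
    return real_size % 2**32    # only the low 32 bits fit


if __name__ == "__main__":
    # 70000 files, as in case 1.
    print(stored_entry_count(70000))
    # A made-up 5 GiB file, as in case 3.
    print(stored_size(5 * 1024**3))
```

For 70000 files this gives the 4464 entries mentioned in case 1, and a 5 GiB file would be recorded as if it were 1 GiB long, which is why sizes and offsets stop matching in case 3.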

As a conclusion to this long and hopefully informative story, I’ll repeat once more: if you plan to share your big archives (more than 65535 files and/or an archive bigger than 4GB and/or a file in the archive bigger than 4GB) with people using other operating systems or other archiving tools on the Mac, DO NOT USE the Apple archiving tools built into Mac OS X. I’d also like to point out that Springy uses the central directory to open and gather information about a ZIP archive and the files inside, but with built-in tricks and workarounds, just to be able to process faulty archives made by Apple’s tools. What’s more, if you open such an archive and modify it, Springy will automatically “fix” it into a proper Zip64 archive!
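One conceivable way to spot such a faulty archive (a sketch only, and not necessarily how Springy does it) is to compare the entry count recorded in the end of central directory record with the number of local file headers found by walking the archive from the start. The Python example below uses made-up helper names, works on a small in-memory archive, and simulates the wrong count by overwriting the field, since building a real 70000-file archive would be excessive for an illustration. It also assumes entries without data descriptors and without Zip64 sizes.

```python
import io
import struct
import zipfile


def count_by_enumeration(data):
    """Count entries by walking the local file headers one after another."""
    pos = count = 0
    while data[pos:pos + 4] == b"PK\x03\x04":
        (flags,) = struct.unpack_from("<H", data, pos + 6)
        if flags & 0x08:
            raise ValueError("data descriptor in use; size is not in the header")
        (compressed_size,) = struct.unpack_from("<I", data, pos + 18)
        name_len, extra_len = struct.unpack_from("<HH", data, pos + 26)
        # Skip the 30-byte fixed header, the name, the extra field and the data.
        pos += 30 + name_len + extra_len + compressed_size
        count += 1
    return count


def count_by_central_directory(data):
    """Read 'total number of entries' from the end of central directory record."""
    pos = data.rfind(b"PK\x05\x06")
    (total,) = struct.unpack_from("<H", data, pos + 10)
    return total


def with_patched_count(data, fake_total):
    """Return a copy of the archive with both 2-byte entry counts overwritten."""
    pos = data.rfind(b"PK\x05\x06")
    patched = bytearray(data)
    struct.pack_into("<HH", patched, pos + 8, fake_total, fake_total)
    return bytes(patched)


if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for name in ("a.txt", "b.txt", "c.txt"):
            zf.writestr(name, b"sample data")
    broken = with_patched_count(buf.getvalue(), 1)  # simulated wrong count
    print(count_by_enumeration(broken), count_by_central_directory(broken))
```

If the two counts disagree, the central directory cannot be trusted as written and the archive needs the kind of special handling described above.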

Apple is aware of all these bugs; I reported them quite some time ago. Unfortunately, nothing has happened since, and the status of the bugs is still “open”.

Dragan