Saturday, January 26th

Decluttering and fixing other people's bugs

Since I moved into my small apartment in Paris, I've become rather obsessed with "de-cluttering".

One of the things that I have the most trouble getting rid of is my old lecture notes. If only I had had a laptop and OneNote at the time, I wouldn't have this problem, but today the fact is that I have quite a lot of notebooks and photocopies taking up valuable space.

What I decided to do is to start scanning and recycling. I use Microsoft Office Document Imaging (MODI) to scan and perform OCR. I started a while ago on a flatbed scanner at home, but that took forever. Fortunately, we have one of those multipurpose photocopiers at the office with an automatic feeder that will quickly scan my documents, even the two-sided ones.

Things have not been as smooth as they should have been, though, mostly because of (and I'm sad to say it) bugs in Microsoft software.

Bug number 1 was when MODI started crashing every time I tried to save a document. That was annoying because unsaved images are not really useful. So I tried to figure out what was happening and realized that the program crashed as soon as I clicked on the File menu. Since the menu itself is pretty static, I assumed the problem had to come from the only dynamic part: the Recent File List. In Microsoft software, these lists are often stored in the registry, so I opened regedit and started looking for the right key. That wasn't easy, but I finally found it:

HKEY_CURRENT_USER\Software\Microsoft\MSPaper 12.0\Recent File List

As you can see, the path to the key is quite obvious when you know the name of the program you are looking for and the path to the executable:

%ProgramFiles%\Common Files\Microsoft Shared\MODI\12.0\MSPVIEW.EXE

(Yes, it's called irony)
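For anyone who would rather look at that key from a script than from regedit, here is a minimal Python sketch. It only reads the values under the key above; it does not change anything. The function names are mine, and I'm assuming the key holds simple name/value pairs (the `File1` entry in the comment is a made-up example, not something I copied from my registry).

```python
import sys

# Key from the post, relative to HKEY_CURRENT_USER.
KEY_PATH = r"Software\Microsoft\MSPaper 12.0\Recent File List"


def format_entries(entries):
    """Render (name, value) pairs as 'name = value' lines for inspection."""
    return [f"{name} = {value}" for name, value in entries]


def read_recent_files(key_path=KEY_PATH):
    """Return the (name, value) pairs stored under HKEY_CURRENT_USER\\key_path.

    Returns an empty list if the key does not exist.
    """
    import winreg  # Windows-only standard library module

    entries = []
    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, key_path) as key:
            # QueryInfoKey returns (subkey count, value count, last modified).
            value_count = winreg.QueryInfoKey(key)[1]
            for index in range(value_count):
                name, value, _value_type = winreg.EnumValue(key, index)
                entries.append((name, value))
    except FileNotFoundError:
        pass  # Key is absent, e.g. MODI is not installed for this user.
    return entries


if __name__ == "__main__":
    if sys.platform == "win32":
        # Hypothetical output line: File1 = C:\Notes\lecture01.tif
        for line in format_entries(read_recent_files()):
            print(line)
    else:
        print("The registry is only available on Windows.")
```

Export the key from regedit before touching anything in it, so you can restore it if needed.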

Bug number 2 is slightly more annoying because it can't be fixed easily. It seems that Windows Desktop Search (WDS), the service responsible for indexing files on Windows Vista, is "too secure" to index the contents of TIFF files, which is what I'm generating, because I already have a bunch of lecture notes in this format.

You see, in order to index the contents of files, WDS uses filters, or IFilters, which are libraries that are also used by other Microsoft programs. But it would seem that the IFilter for TIFF files (and MDI files for that matter) does things that Vista doesn't like (probably creating temp files or something in an unauthorized directory), so the OCR contents of TIFF files are not indexed. And there's pretty much nothing I can do about it aside from finding a third-party filter or writing my own.

I didn't find any third-party filters and I don't have the time to write my own, so I'm just waiting for the fix that Microsoft is working on. The only problem is that the only ETA I've found is "soon."

Other solutions would be to install Google's indexer (which I'm seriously considering) or to change format. But I'm not ready to do either one. So, for now, I'll just name my files appropriately.
