Posts in category mindstorms

The Floppy-Disk Archiving Machine, Mark III

"I'm not building a Mark III."

Famous last words.

I made the mistake of asking my parents if they had any 3.5" floppy disks at their place.

They did.

And a couple hundred of them were even mine.

Faced with the prospect of processing another 500-odd disks, I realized the Mark III was worth doing. So I made a few enhancements for the Floppy Machine Mark III:

  • Changed the gearing of the track motor assembly to increase torque and added plates to keep its structure from spreading apart. The latter had been causing the push rod mechanism to bind up and block the motor, even at 100% power.
  • Removed the 1x4 technic bricks from the end of the tractor tread, lengthened the tread by several links, and built up the top of the structure under those links. This reduced how often something got caught on the structure and caused a problem.
  • Extended the lower half of the drive's shell by replacing the 1x6 technic bricks with 1x10 technic bricks, and added a 1x4 plate on the underside, flush with the end. This made the machine more resilient to the drive being dropped too quickly.
  • Added 1x2 bricks to lock the axles into place for the drive shell's pivot point, since they seemed to be working their way out very slowly.
  • Added 1x16 technic bricks to the bottom of all the legs, and panels to accommodate that, increasing the machine's height by 5" and making it easier to pull disks out of the GOOD and BAD bins.
  • Added doors at the bottom of the trays in the front to keep disks from bouncing out.
  • Added a back wall at the bottom of the trays in the back to keep disks from bouncing out.
  • Moved the ultrasonic sensor lower in an attempt to reduce the false empty magazine scenario. This particular issue was sporadic enough that the effectiveness of the change is hard to determine. I only had one false-empty magazine event after this change.
  • Added a touch sensor to detect when the push rod has been fully retracted in order to protect the motor. Before this, the machine identified the position of the push rod by driving the push rod to the extreme right until the motor blocked. This seems to have had a negative effect upon the motor in question. Turning the rotor of that poor, abused motor in one direction has a very rough feel. This also used the last sensor port on the NXT. (One ultrasonic sensor and three touch sensors.)
  • Replaced the cable to the push rod motor with a longer one from HiTechnic.
  • Significantly modified the controlling software to calibrate locations of the motors in ways that did not require driving a motor to a blocked state.
  • Enhanced the controlling software to allow choosing what events warranted marking a disk as bad and which didn't.
  • Enhanced the data recovery software to allow bailing on the first error detected. This helps when you want to do an initial pass through the disks to get all the good disks archived first. Then you can run the disks through a second time, spending more time recovering the data off the disks.
  • Enhanced the controlling software to detect common physical complications and take action to correct them, such as making additional attempts to eject a disk.
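
The touch-sensor homing mentioned in the list above could look roughly like the sketch below. This is not the machine's actual code: `FakeRod` is a simulated push rod, and `home_push_rod`, the step size, and the power level are invented for illustration. The only interface assumed is a `turn(power, degrees)` method on the motor and an `is_pressed()` method on the sensor.

```python
class FakeRod:
    """Simulated push rod: a motor and a limit switch sharing one position."""

    def __init__(self, position):
        self.position = position  # degrees away from the fully retracted stop

    def turn(self, power, degrees):
        # Negative power retracts; the rod cannot move past the stop at 0.
        step = degrees if power > 0 else -degrees
        self.position = max(0, self.position + step)

    def is_pressed(self):
        # The touch sensor closes when the rod is fully retracted.
        return self.position == 0


def home_push_rod(motor, sensor, power=-60, step=15, max_steps=200):
    """Retract in small steps until the touch sensor reports the rod is home.

    Returns the number of steps taken. Raises RuntimeError rather than
    driving the motor against a hard stop indefinitely.
    """
    for steps_taken in range(max_steps):
        if sensor.is_pressed():
            return steps_taken
        motor.turn(power, step)
    if sensor.is_pressed():
        return max_steps
    raise RuntimeError("push rod never reached the touch sensor")


rod = FakeRod(position=100)
print(home_push_rod(rod, rod))  # 7
```

Because the loop checks the sensor before every move, the motor never has to stall to find its reference point.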

With those changes, the Mark III wound up much more rainbow-warrior than the Mark II:

floppy machine mark iii

And naturally, I updated the model with the changes:

floppy machine mark iii model

The general theme for the Mark II was to rebuild the machine with a cleaner construction, reasonable colors, and reduced part count. The general theme for the Mark III was to improve the reliability of the machine so it could process more disks with less baby-sitting.

All told, I had 1196 floppy disks. If you stack them carefully, they'll fit in a pair of bankers boxes.

boxes of disks

And with that, I'm done. No Mark IV. For real, this time. I hope.

Previously: the Mark II

The Floppy-Disk Archiving Machine, Mark II

Four and a half years ago, I built a machine to archive 3.5" floppy disks. By the time I finished archiving the 443 floppies, I realized that it fell short of what I wanted. There were a couple of problems:

  • many 3.5" floppy disk labels wrap around to the back of the disk
  • disks were dumped into a single bin
  • the machine was sensitive to any shifts to the platform, which consisted of two cardboard boxes
  • the structure of the frame was cobbled together and did not use parts efficiently
  • lighting was ad-hoc and significantly affected by the room's ambient light
  • the index of the disks was cumbersome

I recently had an opportunity to dust off the old machine (quite literally), and do a complete rebuild of it. That allowed me to address the above issues. Thus, I present:

The Floppy-Disk Archiving Machine, Mark II

The Mark II addresses the shortcomings of the first machine.

Under the photography stage, an angled mirror provides the camera (an Android Dev Phone 1) a view of the label on the back of the disk. That image needs perspective correction, and has to be mirrored and cropped to extract a useful image of the rear label. OpenCV serves this purpose well enough, and is straightforward to use with the Python bindings.
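
To make the perspective correction and mirroring concrete, here is a plain-Python sketch of the arithmetic behind a four-corner perspective transform. It is not the OpenCV code used for the machine; in OpenCV the same job is covered by `cv2.getPerspectiveTransform` and `cv2.warpPerspective`. The corner coordinates and output size below are invented for the example.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 8 coefficients mapping four src corners onto four dst corners."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve(a, b)


def warp_point(h, x, y):
    """Apply the perspective transform to a single point."""
    w = h[6] * x + h[7] * y + 1
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical label corners as seen in the mirror: TL, TR, BR, BL.
src = [(120, 80), (520, 95), (560, 400), (90, 380)]
width, height = 400, 300
# Listing the destination corners right-to-left folds the mirroring
# into the same transform, so no separate flip is needed.
dst = [(width, 0), (0, 0), (0, height), (width, height)]
h = homography(src, dst)
```

Cropping then falls out for free: only pixels that land inside the `width` by `height` rectangle belong to the label.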

The addition of lights and tracing-paper diffusers improved the quality of the photos and reduced the glare. It also made the machine usable whether the room lights were on or off.

The baffle under the disk drive allows the machine to divert the ejected disks into either of two bins. I labeled those bins "BAD" and "GOOD". I wrote the control software (also Python) to accept a number of options to allow sorting the disks by different criteria. For instance, sometimes OpenCV's object matching selects a portion of a disk or its label instead of the photography stage's arrows. When that happens, the extraction of the label will fail. That can happen for either the front or back disk labels. The machine can treat such a disk as 'BAD'. When a disk is processed, and bad bytes are found, the machine can treat the disk as bad. The data extraction tool supports different levels of effort for extracting data from around bad bytes on a disk.

This allows for a multiple-pass approach to processing a large number of disks.

In the first pass, if there is a problem with either picture, or if there are bad bytes detected, sort the disk as bad. That first pass can configure the data extraction to not try very hard to get the data, and thus not spend much time per disk. At the end of the first pass, all the 'GOOD' disks have been successfully read with no bad bytes, and labels successfully extracted. The 'BAD' disks however, may have failed for a mix of different reasons.

The second pass can then expend more effort extracting data from disks with read errors. Disks which encounter problems with the label pictures would still be sorted as 'BAD', but disks with bad bytes would be sorted as 'GOOD' since we've extracted all the data we can from them, and we have good pictures of them.

That leaves us with disks that have failed label extraction at least once, and probably twice. At this point, it makes sense to run the disks through the machine and treat them as 'GOOD' unconditionally. Then the label extraction tool can be manually tweaked to extract the labels from this small stack of disks.
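
The three passes above boil down to a small sorting policy. The sketch below is one way to express it. The class names, field names, and `choose_bin` are hypothetical and do not come from the actual control software.

```python
from dataclasses import dataclass


@dataclass
class DiskResult:
    front_label_ok: bool
    back_label_ok: bool
    bad_bytes: int


@dataclass
class PassPolicy:
    bad_on_label_failure: bool
    bad_on_bad_bytes: bool


# Hypothetical policies mirroring the three passes described above.
FIRST_PASS = PassPolicy(bad_on_label_failure=True, bad_on_bad_bytes=True)
SECOND_PASS = PassPolicy(bad_on_label_failure=True, bad_on_bad_bytes=False)
FINAL_PASS = PassPolicy(bad_on_label_failure=False, bad_on_bad_bytes=False)


def choose_bin(result, policy):
    """Return 'GOOD' or 'BAD' for one processed disk under the given policy."""
    labels_ok = result.front_label_ok and result.back_label_ok
    if policy.bad_on_label_failure and not labels_ok:
        return "BAD"
    if policy.bad_on_bad_bytes and result.bad_bytes > 0:
        return "BAD"
    return "GOOD"


damaged = DiskResult(front_label_ok=True, back_label_ok=True, bad_bytes=512)
print(choose_bin(damaged, FIRST_PASS), choose_bin(damaged, SECOND_PASS))  # BAD GOOD
```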

Once the disks have been successfully photographed and all available data extracted, an html-based index can be created. That process creates one page containing thumbnails of the front of the disks.

index of floppies screenshot

Each thumbnail links to a page for a disk giving ready access to:

  • a full-resolution picture of the extracted front label
  • a full-resolution picture of the extracted back label
  • a zip file containing the files from the disk
  • a browsable file tree of the files from the disk
  • an image of the data on the disk
  • a log of the data extracted from the disk
  • the un-processed picture of the front of the disk
  • the un-processed picture of the back of the disk

single disk screenshot

The data image of the disk can be mounted for access to the original filesystem, or forensic analysis tools can be used on it to extract deleted files or do deeper analysis of data affected by read errors. The log of the data extracted includes information describing which bytes were read successfully, which had errors, and which were not directly attempted. The latter may occur due to time limits placed on the data extraction process.

Since a single bad byte may take ~4 seconds to return from the read operation, and there may be 1474560 bytes on a disk, if every byte were bad you could spend nearly 10 weeks on a single disk and recover nothing. The data recovery software (also written in Python) therefore prioritizes the sections of the disk that are most likely to contain the most good data. This means that in practice everything that can be read off the disk will be read off in less than 20 minutes. For a thorough run, I will generally configure the data extraction software to give up if it has not successfully read any data in the past 30 minutes (it's only machine time, after all). At that point, the odds of any more bytes being readable are quite low.
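
The worst-case figure and the give-up rule can be sketched as follows. The arithmetic uses the numbers quoted above. The `recover` function is only one plausible reading of "prioritize the sections most likely to contain the most good data": it tries the largest unread range first and splits any range that fails. That heuristic, the function name, and the `read_range` callback are assumptions, not the actual recovery tool.

```python
import heapq

DISK_BYTES = 1474560      # capacity quoted in the post
BAD_READ_SECONDS = 4      # approximate cost of one failed read, per the post

# Worst case: every byte fails individually.
worst_case_weeks = DISK_BYTES * BAD_READ_SECONDS / (7 * 24 * 3600)
print(round(worst_case_weeks, 2))  # 9.75


def recover(read_range, size, idle_limit):
    """Read the largest unread ranges first, splitting any range that fails.

    read_range(start, length) returns (ok, seconds_spent). Recovery stops
    once idle_limit seconds pass without a successful read. Returns three
    lists of (start, length) ranges: good, bad, and never attempted.
    """
    good, bad = [], []
    queue = [(-size, 0)]          # max-heap on length via negated key
    idle = 0.0
    while queue and idle < idle_limit:
        neg_length, start = heapq.heappop(queue)
        length = -neg_length
        ok, seconds = read_range(start, length)
        if ok:
            good.append((start, length))
            idle = 0.0
            continue
        idle += seconds
        if length == 1:
            bad.append((start, length))
        else:
            half = length // 2
            heapq.heappush(queue, (-half, start))
            heapq.heappush(queue, (-(length - half), start + half))
    skipped = [(start, -neg_length) for neg_length, start in queue]
    return good, bad, skipped
```

The three returned lists correspond to the three states recorded in the log: read successfully, had errors, and not directly attempted.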

So what does the machine look like in action?

(Also posted to YouTube.)

Part of the reason I didn't disassemble the machine while it collected dust for 4.5 years was that I knew I would not be able to reproduce it should I have need of it again in the future. Doing a full rebuild of the machine allowed me to simplify the build dramatically. That made it feasible to create an LDraw model of it using LeoCAD.

rendering of digital model

Rebuilding the frame with an eye to modeling it in the computer yielded a significantly simpler support mechanism, and one that proved to be more rigid as well. To address the variations of different platforms and tables, I screwed a pair of 1x2 boards together with some 5" sections of 1x4 using a pocket hole jig. The nice thing about the 5" gap between the 1x2 boards is that the Lego bricks are 5/16" wide, so 16 studs fit neatly within that gap. The vertical legs actually extend slightly below the top of the 1x2's, and the bottom horizontal frame rests on top of the boards. This keeps the machine from sliding around on the wooden frame, and makes for a consistent, sturdy platform which improves the machine's reliability.

The increase in stability and decrease in parts required also allowed me to increase the height of the machine itself to accommodate the inclusion of the disk baffle and egress bins.

What about a Mark III?

Uhm, no.

I have processed all 590 disks in my possession (where did the additional 150 come from?), and will be having these disks shredded. That said, the Mark II is not a flawlessly perfect machine. Were I to build a third machine, increasing the height a bit further to make the disk bins more easily accessible would be a worthwhile improvement. Likewise, the disk magazine feeding the machine is a little awkward to load with the cables crossing over it, and could use some improvement so that the weight of a tall stack of disks does not impede the proper function of the pushrod.

So, no, I'm not building a Mark III. Unless you or someone you know happens to have a thousand 3.5" floppy disks you need archived, and are willing to pay me well to do it. But who still has important 3.5" floppy disks lying around these days? I sure don't. (Well, not anymore, anyway.)

Previously: the Mark I

Update: the Mark III

Programming the Floppy-disk Archiving Machine

I used the nxt-python-2.0.1 library to drive the floppy-disk archiving machine. I don't see value in releasing the full source code for driving the machine as it is very tied to the details of the robot build, but there are a few points of interest to highlight. (The fact that the code also happens to be 200 lines of ugliness couldn't possibly have influenced that decision in any way.)

Overview of nxt-python

The library provides an object oriented API to drive the NXT brick. First, you get the brick object like this:

import nxt.locator
brick = nxt.locator.find_one_brick()

Motor objects are created by passing the motor and the NXT port it is connected to:

import nxt.motor
eject_motor = nxt.motor.Motor(brick, nxt.motor.PORT_B)

Motors can be run by a couple of methods, but the method I used was the turn() method. This takes a powerlevel, and the number of degrees to rotate the motor. The powerlevel can be anything from -127 to 127, with negative values driving the motor in reverse. The higher the powerlevel, the faster the motor turns and the harder it is to block, but it will also not stop exactly where you wanted it to stop. Lower powerlevels give you more exact turns, but won't overcome as much friction. So I found that it worked best to drive the motor at high powerlevels to get a rough position, then drive it at lower powerlevels to tune its position. To determine how far the motor actually turned, I used motor.get_tacho().tacho_count. That value then allowed me to drive slowly to the correct position from the actual position achieved.
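
The coarse-then-fine approach described above might be sketched like this. `FakeMotor` is a simulation with an invented overshoot model so the example runs without a brick. It mimics only the two calls mentioned in the text, `turn(power, degrees)` and `get_tacho().tacho_count`. `turn_precisely` and its default power levels are hypothetical.

```python
from types import SimpleNamespace


class FakeMotor:
    """Simulated motor that overshoots more at higher power levels."""

    def __init__(self):
        self._count = 0

    def turn(self, power, degrees):
        overshoot = abs(power) // 20      # invented: power 100 overruns by 5
        direction = 1 if power > 0 else -1
        self._count += direction * (degrees + overshoot)

    def get_tacho(self):
        return SimpleNamespace(tacho_count=self._count)


def turn_precisely(motor, degrees, fast=100, slow=15, tolerance=1, max_tries=10):
    """Turn `degrees` (signed) from the current position: fast first, then trim.

    Returns the remaining error in tacho units.
    """
    target = motor.get_tacho().tacho_count + degrees
    power = fast
    for _ in range(max_tries):
        error = target - motor.get_tacho().tacho_count
        if abs(error) <= tolerance:
            break
        motor.turn(power if error > 0 else -power, abs(error))
        power = slow                      # every correction after the first is slow
    return motor.get_tacho().tacho_count - target


motor = FakeMotor()
print(turn_precisely(motor, 360))  # 0
```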

When a motor is unable to rotate as far as instructed at the powerlevel specified, it will raise an nxt.motor.BlockedException. While typically you should probably avoid having that happen, I found that by designing the robot to have a "zeroing point" that I could drive the motor to until it blocked, I could recalibrate the robot's positioning during operation and increase the reliability of the mechanism.
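
A minimal sketch of that zeroing idea follows. So that it runs without the library, it defines its own `BlockedException` as a stand-in for `nxt.motor.BlockedException`, and `FakeRam` simulates a mechanism with a hard stop. `zero_against_stop` and its defaults are hypothetical.

```python
class BlockedException(Exception):
    """Stand-in for nxt.motor.BlockedException so the sketch runs untethered."""


class FakeRam:
    """Simulated ram with a hard stop at position 0."""

    def __init__(self, position):
        self.position = position

    def turn(self, power, degrees):
        step = degrees if power > 0 else -degrees
        if self.position + step < 0:
            self.position = 0
            raise BlockedException("hit the hard stop")
        self.position += step


def zero_against_stop(motor, power=-80, travel=10000):
    """Drive toward the stop until the motor blocks, then treat that as zero.

    Returns True if the motor blocked (position now known), False if it
    completed the whole travel without finding the stop.
    """
    try:
        motor.turn(power, travel)
    except BlockedException:
        return True
    return False


ram = FakeRam(position=137)
print(zero_against_stop(ram), ram.position)  # True 0
```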

Implementation Details

In order to keep the NXT from going to sleep, I set up a keepalive with brick.keep_alive() every 300 seconds. I believe the NXT brick can be configured to avoid needing that. In the process, I discovered that the nxt-python library does not appear to be threadsafe; sometimes the keep_alive would interfere with a motor command and trigger an exception.
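
One way to work around that interference is to serialize all access to the brick behind a lock. The sketch below is an assumption about how that could be done, not what the original code did. `LockedBrick` and `FakeBrick` are hypothetical names, and the only library call relied on is the `keep_alive()` method mentioned above.

```python
import threading


class FakeBrick:
    """Simulated brick that just counts keep-alive calls."""

    def __init__(self):
        self.pings = 0

    def keep_alive(self):
        self.pings += 1


class LockedBrick:
    """Wrap a brick so the keep-alive and motor commands never overlap."""

    def __init__(self, brick, interval=300):
        self.brick = brick
        self.lock = threading.Lock()
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval):
            with self.lock:
                self.brick.keep_alive()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

Motor commands would then be issued inside `with locked.lock:` so that a keep-alive can never land in the middle of one.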

I structured my code so that I had a DiskLoadingMachine object with a brick, load_motor, eject_motor, and dump_motor. This allowed me to build high-level instructions for the DiskLoadingMachine such as stage_disk_for_photo().

Another thing I did was to sub-class nxt.motor.Motor and override the turn() method to accept either a tacho_units or a studs parameter. This allowed me to set a tacho_units-to-studs ratio, and turn the motor the right number of turns to move the ram a specified number of studs.
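
The sub-class could be sketched along these lines. `BaseMotor` is a stand-in for `nxt.motor.Motor` so the example runs without the library, and `StudMotor`, `tacho_per_stud`, and the ratio of 45 are invented for illustration.

```python
class BaseMotor:
    """Stand-in for nxt.motor.Motor; records the last requested turn."""

    def __init__(self):
        self.last_turn = None

    def turn(self, power, tacho_units):
        self.last_turn = (power, tacho_units)


class StudMotor(BaseMotor):
    """Motor whose turn() accepts a distance in tacho units or in studs."""

    def __init__(self, tacho_per_stud):
        super().__init__()
        self.tacho_per_stud = tacho_per_stud

    def turn(self, power, tacho_units=None, studs=None):
        # Exactly one of the two distance arguments must be supplied.
        if (tacho_units is None) == (studs is None):
            raise ValueError("pass exactly one of tacho_units or studs")
        if studs is not None:
            tacho_units = round(studs * self.tacho_per_stud)
        super().turn(power, tacho_units)


ram_motor = StudMotor(tacho_per_stud=45)
ram_motor.turn(80, studs=4)
print(ram_motor.last_turn)  # (80, 180)
```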

Room for Improvement

I think there is room to enhance nxt-python's implementation of Motor.turn, or to add a Motor.smart_turn. The idea here is to specify the distance to rotate the motor and have the library drive the motor as quickly as it can while still making the rotation hit the exact distance specified. Depending on implementation, it might make sense to have the ability to specify some heuristic tunables determined by a one-time calibration process. Drive trains with significant angular momentum, gear lash, or variable loadings may make it difficult to implement in the general case.

Alternatively, perhaps Motor.turn_to() would be a more robust approach: provide an absolute position to turn the motor to. It should then have a second parameter with three options: FAST, PRECISE, and SMART. FAST would use max power at the cost of probably overrunning the target, while PRECISE would turn more slowly and get to the correct position, and SMART would ramp up the speed to get to the correct position without overrunning it at the cost of a more variable rate. The implementation would also imply operating with absolute positions rather than specifying how much to turn the motor. There can be some accumulation of error, so such an implementation would need a method for re-zeroing the motor.
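
To illustrate the proposal, here is a sketch of how the three modes might translate an absolute target into motor moves. Nothing here exists in nxt-python: `Mode`, `plan_turn_to`, and the power and ramp values are hypothetical, and SMART is reduced to a crude two-stage approximation (full power, then a slow final approach) rather than a true speed ramp.

```python
from enum import Enum


class Mode(Enum):
    FAST = "fast"
    PRECISE = "precise"
    SMART = "smart"


def plan_turn_to(current, target, mode, full=100, slow=20, ramp=90):
    """Return (power, degrees) moves taking the motor from current to target.

    Power is signed for direction; degrees are always positive.
    """
    distance = target - current
    if distance == 0:
        return []
    sign = 1 if distance > 0 else -1
    remaining = abs(distance)
    if mode is Mode.FAST:
        return [(sign * full, remaining)]
    if mode is Mode.PRECISE:
        return [(sign * slow, remaining)]
    # SMART: cover most of the distance at full power, then finish the
    # last `ramp` degrees slowly so the motor does not overrun the target.
    cruise = max(0, remaining - ramp)
    moves = []
    if cruise:
        moves.append((sign * full, cruise))
    moves.append((sign * slow, remaining - cruise))
    return moves


print(plan_turn_to(0, 360, Mode.SMART))  # [(100, 270), (20, 90)]
```

Because the plan is computed from the current absolute position, re-zeroing the motor only means resetting `current`.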

Making the library threadsafe is an obvious step for making this library more robust.

A default implementation of a keep-alive process for the brick object would also be worth considering.

Conclusions

Despite the threading issue, the nxt-python library was very useful and helped me quickly create a functioning robot. If you're looking to use a real programming language to drive a tethered NXT, nxt-python will serve you well.

3.5" Floppy-disk Archiving Machine

August 31st of last year, at the age of 89, my grandfather passed away. I'm a computer geek, as was he, though his machines filled rooms, and mine, merely pockets. His software flew fighter aircraft. He worked on the Apollo missions. He wrote the first software by which to operate a nuclear reactor. That is a hard act to follow.

But as a computer geek, he had accumulated a large stack of 3.5" floppy disks: 443 of them, in fact. And when he passed away, it became my responsibility to deal with those. I was not looking forward to the days of mindless repetition inherent in that task. So, I did what any self-respecting software engineer would do: I automated it.

Start with Lego Mindstorms, add a laptop running Fedora Linux, an Android Dev Phone 1, a good bit of Python code, and about the same number of hours of work, and you get this:

picture of floppy archiving machine

Watch it in action on YouTube

There are a number of interesting details in this build which I plan to write about in the coming weeks, so stay tuned.

Follow up articles: NXT control software, The Floppy-Disk Archiving Machine, Mark II