Hacking Vision?

An interesting idea from Mark Changizi from RPI: can one design pictures which, when interpreted by your vision, perform a computation? Press release here (note to RPI public relations department: you should probably make it so that the webpage address of your press releases can be copied from the browser address bar. Somewhere a web designer should be shot.) and paper in Perception published here.

The basic idea is to use the orientation information we glean from looking at objects to perform computations. Thus, for example, Changizi suggests that we can represent zeros and ones via the two different orientations seen in this picture:

I See Zero and One. Taken from Changizi M, 2008, “Harnessing vision for computation” Perception 37(7) 1131 – 1134

Okay so far so good. I definitely see a zero and a one. Now the idea is that by putting elements like this together one can then have the part of your vision system which computes these orientations perform a computation. Cool idea, no? But, try as I might, I just can’t see how the gadgets described in the article work. For instance, here is the proposed NOT gate, which should flip the orientation of the input blocks:

I do not see a NOT. Taken from Changizi M, 2008, “Harnessing vision for computation” Perception 37(7) 1131 – 1134

So, is my visual system just messed up and not able to perform this computation? Do others see the computation? And if they can, then does this mean that I’m doomed to forever be not performing computations by just looking, whereas there may exist people who can do a whole eight-bit adder just by looking?
This also makes me wonder whether there are any similar concepts in other senses: perhaps in sound? (Which leads naturally to: you may think you are listening to the latest song from Band of Horses but really you’re calculating the thirtieth digit of Pi)
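
For what it’s worth, an eight-bit adder really can be assembled from nothing but the NOT and OR gates mentioned here, since AND follows from De Morgan’s law. Below is a toy Python sketch of that construction. It models a bit as a plain 0 or 1 rather than a perceived orientation, and every function name in it (`add8`, `full_adder`, and so on) is made up for illustration, not taken from Changizi’s paper.

```python
# Toy model: a bit is one of two orientations, written here as 0 or 1.
# Only NOT and OR are treated as primitives; all other gates are derived.
# All names are hypothetical and chosen for illustration only.

def NOT(a):
    """Flip the bit."""
    return 1 - a

def OR(a, b):
    """Return 1 if either input is 1."""
    return a | b

def AND(a, b):
    """De Morgan: a AND b == NOT(NOT a OR NOT b)."""
    return NOT(OR(NOT(a), NOT(b)))

def XOR(a, b):
    """Return 1 if exactly one input is 1."""
    return AND(OR(a, b), NOT(AND(a, b)))

def full_adder(a, b, carry_in):
    """Add three bits, returning (sum_bit, carry_out)."""
    partial = XOR(a, b)
    total = XOR(partial, carry_in)
    carry_out = OR(AND(a, b), AND(partial, carry_in))
    return total, carry_out

def add8(x, y):
    """Add two 8-bit integers with a ripple-carry chain of full adders.

    Returns (sum modulo 256, final carry bit).
    """
    result, carry = 0, 0
    for i in range(8):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry
```

For example, `add8(200, 100)` returns `(44, 1)`: the sum 300 wraps around modulo 256 and the carry bit is set. Whether anyone can run eight of these in their head by staring at a picture is another matter.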

10 Replies to “Hacking Vision?”

  1. I do see it, but I note that I see it for NOT(0) (the right-hand figure) a bit more easily than for NOT(1) (the left-hand figure); the latter reverses on me too easily (My perception of Necker Cubes tends to switch fairly easily and rapidly between the two alternatives).

  2. Hmm. I didn’t read the paper, and I admit this is a bit outside my area, so I may be missing something here.
    After staring at those figures, I thought it might be something that should be looked at more from an artist’s point of view and not try to pack too much into the illustrations conceptually. Because in a way it sounds like all this is is an attempt to create highly intuitive illustrations of building block thought processes.
    The second pair of figures in particular suffers from the kind of science-itis that drives designers nuts. (Don’t get me wrong, designers are riddled with their own weird faults.) Perspective in itself is just an optical trick, but here the image is confounded by the ambiguity Kevin C. noted as the Necker cube — a wire frame with missing or faulty depth cues. It’s putting you in the position of having to do so much conscious work to resolve the intended positions and purpose of the objects depicted, that it defeats the whole purpose of intuitive coding.
    I can’t help thinking that there’s some mumbo-jumbo here purposely hiding the fact that the authors were just too cheap to work with designers on this project. Clear illustration, like clear writing, is just so… mundane. Doing useful stuff with mind-bending illusions is a cool idea, I’m just not feelin’ it here.

  3. Imagine the bit falling into the right box, rotate it around the cone and when it comes out in the left box, it’s inverted. The problem with the rotation is you have to imagine it rotating in the direction of the light grey side. So for the 0 bit it rotates around in front of the cone, and for the 1 bit it rotates around in the back of the cone.

  4. I can see how it’s supposed to work, but it just seems too clumsy. I think that the cone is somehow supposed to “guide” the flip, but it’s too easy for me to see the whole image and detect the same orientation as the input. Also, my brain keeps wanting to go Escher on me with that hinge construction.

  5. Does this even have to be in 3D? The article mentions that transparency plays a role for the OR-gate.
    Changizi says “our visual system (the hardware) would automatically and effortlessly generate a perception, which would inform us of the output of the computation.” I don’t see how this would be effortless. Computing the outputs of primitive elements is easy in any scheme. The hard part is keeping all the current outputs in one’s head.

  6. Koray: “Does this even have to be in 3D? … The hard part is keeping all the current outputs in one’s head.”

  7. All I can tell is that NOT(1) is messed up. The geometry is all weird. Maybe that is what makes it a 1?
    The 0 has clean geometry. Maybe that’s what makes it clear/transducive.
