I had an enlightening conversation yesterday with a class from Texas A&M University.
Back when I was a graduate student at MIT, the focus of my research was on computer recognition of hand-drawn sketches. This was in the Architecture Machine Group, part of the Department of Architecture, and the work was part of an ongoing effort to advance the state of tools for architects and designers. (This turned out to be part of a much larger problem in how people interact with computers and digital media, and the Architecture Machine Group morphed into what is now the MIT Media Laboratory.)
Sketch recognition, like machine vision, is the process of taking raw data, in this case from a data tablet, extracting features such as lines and curves, and ultimately determining what three-dimensional objects those lines and curves represent. The initial part of my work was developing algorithms that crunched the data streamed from the tablet, using not just the resulting two-dimensional image but also looking at how fast the pen moved and how hard it was pressed onto the tablet. For example, it was possible to detect a corner by finding where the user had slowed down to turn it. This bottom-up approach worked pretty well for extracting low-level features, but it became apparent that understanding the user's higher-level intentions required a top-down application of the context in which the drawing was made, e.g., knowing that the user was designing a house meant the program could look for patterns resembling a wall or a door.

Also like machine vision, this was a problem in Artificial Intelligence, so I took my data over to the MIT AI Lab's PDP-10 and wrote some programs in Sussman and McDermott's CONNIVER language. The results were promising, but even for a simple drawing the amount of computation required was prodigious. I recall Carl Hewitt coming up to the computer room to find out why the PDP-10 was so slow and suggesting I not run my program during the day. I finished my thesis, published a paper at the SIGGRAPH conference, and concluded that further advances in sketch recognition would await advances in AI and in Moore's Law. Meanwhile I moved on to more tractable topics such as how people would use digital media.
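The slowdown heuristic for corner detection can be sketched in a few lines of modern Python (my own minimal illustration, not the original thesis code; the `speed_ratio` threshold and the point format are assumptions):

```python
import math

def find_corners(points, speed_ratio=0.5):
    """points: list of (x, y, t) pen samples in stroke order.

    Returns indices into `points` at likely corners, found where the
    pen's speed dips below speed_ratio * the stroke's mean speed --
    i.e., where the user slowed down to turn the corner.
    """
    if len(points) < 3:
        return []
    # Instantaneous speed between each pair of consecutive samples.
    speeds = []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = (t1 - t0) or 1e-9  # guard against duplicate timestamps
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    threshold = speed_ratio * (sum(speeds) / len(speeds))
    # Collapse each run of slow samples into a single corner at its middle.
    corners, run = [], []
    for i, s in enumerate(speeds):
        if s < threshold:
            run.append(i)
        elif run:
            corners.append((run[0] + run[-1]) // 2 + 1)
            run = []
    if run:
        corners.append((run[0] + run[-1]) // 2 + 1)
    return corners

# An L-shaped stroke sampled with a slowdown at the bend (larger time
# gaps around the fifth sample): the corner is reported at index 4.
stroke = [(0, 0, 0), (1, 0, 1), (2, 0, 2), (3, 0, 3), (4, 0, 6),
          (4, 1, 9), (4, 2, 10), (4, 3, 11), (4, 4, 12)]
print(find_corners(stroke))  # → [4]
```

A real recognizer would combine this speed cue with curvature and pressure, but even this crude version finds the bend in the example stroke.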
Over the next few decades, AI remained a long-term problem, but Moore's Law made computers considerably faster and reawakened interest in sketch recognition. Recently, I heard from Tracy Hammond, Assistant Professor at Texas A&M University and director of its Sketch Recognition Laboratory. She invited me to talk with her class, which I did yesterday. As happens in a lot of fields, what used to be a multi-year research project is now just a homework assignment. In this case, the class is Special Topics in Sketch Recognition (CPSC 689-608), and the first assignment is to write a simple sketch recognizer. The question of when we will have an artificial intelligence as smart as a human is as elusive as ever, but a lot of progress has been made in domain-specific areas, and the students were very interested in applying AI to this domain. I look forward to visiting the lab next time I am in Texas. In the meantime, you can look at their work here. (Shown at right are the raw data and the output of PaleoSketch.)