Sunday, January 24, 2010

Augmenting Interactive Tables with Mice & Keyboards

Paper written by: Björn Hartmann (Stanford University)
Meredith Ringel Morris, Hrvoje Benko, and Andrew D. Wilson (Microsoft Research)

Comments: Gus Zarych, Jo Anne Rodriguez

Summary:
This paper explores augmenting a large interactive tabletop with wireless mice and keyboards. The authors start by listing problems that occur with interactive tabletops and then describe what can be done to solve them. One problem is that direct touch can limit the precision and responsiveness of input; this is why styli are used on devices that are too small to navigate comfortably with our fingers. Tabletops have recently grown large enough that direct touch is viable and additional input devices can fit on the surface. This paper focuses on using mice and keyboards to interact with the table, providing higher precision and performance, enabling interaction with distant objects, and minimizing physical movement. The paper describes a scenario in which a group of people does research for a project, explaining how the table works and how the users interact with it and with each other. When someone sets down a keyboard, a screen pops up in front of them where they can navigate, run search queries, or write documents. Each person is assigned a color on the tabletop to distinguish which input devices they are using and which screens those devices are associated with. If two keyboards are placed together, they are connected and their owners can jointly work on one screen. The paper then describes the different ways the devices can be linked and how users can work with them together as a group to edit and create projects.


One way to link a device is called link-by-docking, which creates an association based on the device's proximity: wherever the keyboard is placed, the user can drag a digital screen next to it so that the two are linked. Another way is link-by-placing, which is simply placing a keyboard next to a digital screen that is already present on the tabletop in order to link them.
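A rough sketch of how proximity-based linking like link-by-placing might be decided (the function names, threshold, and coordinate system are illustrative assumptions, not taken from the paper):

```python
import math

# Hypothetical positions: each keyboard and screen has an (x, y) centre on the tabletop.
LINK_DISTANCE = 15.0  # cm; an assumed proximity threshold

def link_by_placing(keyboard_pos, screens):
    """Return the screen (if any) close enough to the keyboard to form a link."""
    best, best_dist = None, LINK_DISTANCE
    for screen_id, screen_pos in screens.items():
        dist = math.dist(keyboard_pos, screen_pos)
        if dist < best_dist:
            best, best_dist = screen_id, dist
    return best  # None means no link is formed

# Example: a keyboard set down near "screen_b" becomes linked to it.
screens = {"screen_a": (10.0, 80.0), "screen_b": (62.0, 34.0)}
print(link_by_placing((55.0, 30.0), screens))  # -> "screen_b"
```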

A contextual command prompt appears when there is no target screen: the user places a keyboard down on the table and a text box appears above it. The box tracks the keyboard's position and interprets the text that has been entered in order to decide what to do. Pose-based input modification lets you place two keyboards together so that they are joined, enabling joint search queries. Besides connecting a keyboard to a digital screen on the table, or connecting two keyboards to one screen, there are also ways to connect a mouse to a specific keyboard and screen. Remote object manipulation allows a user to click the mouse, touch an object on the table, and move it; two mice can also be used together without any change to the technique. The leader line locator helps users find their cursor on a large tabletop among other users' cursors. To connect a keyboard with a mouse there are again two options: link-by-proximity, where you place a mouse next to the keyboard you want to link to, and link-by-clicking, where you move the cursor into the area associated with a keyboard and click to associate the two devices. A user can also link one mouse to multiple keyboards this way.
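A minimal sketch of the link-by-clicking idea, assuming each keyboard owns a rectangular region around it on the table (the region representation and names are mine, not the paper's):

```python
# Link a mouse to whichever keyboard's region its cursor was clicked in.
# A single mouse may accumulate links to several keyboards.

def link_by_clicking(mouse_id, click_pos, keyboard_regions, links):
    """Associate the clicking mouse with the keyboard whose region was clicked."""
    cx, cy = click_pos
    for kb_id, (x, y, w, h) in keyboard_regions.items():
        if x <= cx <= x + w and y <= cy <= y + h:
            links.setdefault(mouse_id, set()).add(kb_id)
            return kb_id
    return None  # click landed outside every keyboard's region

links = {}
regions = {"kb_alice": (0, 0, 40, 20), "kb_bob": (60, 0, 40, 20)}
link_by_clicking("mouse_1", (70, 10), regions, links)
link_by_clicking("mouse_1", (15, 5), regions, links)
print(links)  # {'mouse_1': {'kb_alice', 'kb_bob'}}
```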



When multiple users share a tabletop, it is more efficient to use the reference-frame options that are available with the interactive tabletop and input devices. To get the best precision from a mouse, a reference frame can be created that the mouse's movement is mapped into. Since tables are becoming larger and it is easier to have multiple people working on them, a frame like this also keeps one user's cursor from drifting into someone else's area of the table.
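A sketch of what a per-user reference frame might look like, under assumed conventions (each user gets a rectangle on the table plus an orientation; raw mouse deltas are rotated into that frame and clamped so the cursor stays inside the user's area):

```python
import math

class ReferenceFrame:
    def __init__(self, x, y, width, height, angle_deg):
        self.x, self.y, self.w, self.h = x, y, width, height
        self.angle = math.radians(angle_deg)      # which way the user faces
        self.cx, self.cy = width / 2, height / 2  # cursor starts in the middle

    def move(self, dx, dy):
        # Rotate the mouse delta so "up" matches the user's seating direction.
        rx = dx * math.cos(self.angle) - dy * math.sin(self.angle)
        ry = dx * math.sin(self.angle) + dy * math.cos(self.angle)
        # Clamp to the frame so the cursor never leaves this user's region.
        self.cx = min(max(self.cx + rx, 0), self.w)
        self.cy = min(max(self.cy + ry, 0), self.h)
        return (self.x + self.cx, self.y + self.cy)  # table coordinates

frame = ReferenceFrame(x=100, y=50, width=60, height=40, angle_deg=180)
print(frame.move(10, 0))  # the delta is flipped for a user seated opposite
```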

For projects that require certain users to work on certain parts, a keyboard can optionally ask for the user's credentials before interaction begins. This associates their ID with whatever work they do.

Discussion:
I think this paper is interesting and significant because current generations of computers and electronic devices are becoming very touch oriented. I think this could help companies let larger groups work with each other on the same project, in the same room, without having to communicate through their cubicles (or offices). There could be some faults in how the devices actually become associated with each other, or in the way they interact with each other, but the idea seems pretty great. I think this could be turned into a business-wide tool for certain companies because it allows multiple people to work on a project while each person is logged in, so that all of their work is associated with their ID. The bosses can see what each person has worked on and can even do reviews based on that work; this could also be a bad thing, but in a way it could allow businesses to be more efficient.

Sikuli: Using GUI Screenshots for Search and Automation

Paper written by: Tom Yeh, Tsung-Hsiang Chang, and Robert C. Miller

Comment: Jesus, Zachary

Summary:
Sikuli is a "visual approach to search and automation of graphical user interfaces using screenshots" (183). Basically, this application lets you query with screenshots in order to get better results and answers to your questions. It starts off by relating situations such as pointing at an object and asking what it does to situations on the internet where you are looking at something and are confused about what it does, why it did something, or how you can navigate the application. Sikuli allows a user to draw a box around something on screen and query a search engine using that picture. Three components make up Sikuli: the screenshot search engine, the interface for querying that search engine, and another interface for adding screenshots with custom annotations to the index.
The screenshot search engine indexes screenshots taken from various online tutorials, books, and official documentation. When a user submits a picture as a query, the search engine first looks at the text that surrounds the image. Then it uses any visual features it can detect: screenshots are represented as "visual words" and indexed using an inverted index that has multiple entries for a particular word. Thirdly, the engine can index screenshots according to embedded text; to improve results it uses 3-grams, and each 3-gram is treated as a visual word. To get a better idea of how all of this would work, the authors created a prototype whose database contained about 102 documents from various sources that help explain different applications. They then ran a user study to test two hypotheses: "(1) screenshot queries are faster to specify than keywords queries, and (2) results of screenshot and keyword search have roughly the same relevance as judged by users" (185). Each participant was given a random dialog box and asked to run different queries depending on the dialog box they received. After querying, each user was asked to mark the top 5 results as relevant or irrelevant. After all of these tasks were completed, they filled out a short questionnaire.
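A toy sketch of the inverted-index idea described above, with the feature-extraction step omitted and all identifiers invented for illustration:

```python
from collections import defaultdict

index = defaultdict(set)           # visual word -> set of document ids

def add_screenshot(doc_id, visual_words):
    """Index a screenshot that has already been reduced to a bag of visual words."""
    for word in visual_words:
        index[word].add(doc_id)

def query(visual_words):
    """Rank documents by how many of the query's visual words they share."""
    scores = defaultdict(int)
    for word in visual_words:
        for doc_id in index.get(word, ()):
            scores[doc_id] += 1
    return sorted(scores, key=scores.get, reverse=True)

add_screenshot("tutorial_1", {"vw12", "vw40", "vw77"})
add_screenshot("tutorial_2", {"vw40", "vw91"})
print(query({"vw40", "vw77"}))     # ['tutorial_1', 'tutorial_2']
```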
They used this study to test their application for keyword queries against the screenshot queries and found that the average time was less than half as long for screenshots as it was for keywords. The number of relevant results on the other hand were very close when compared and wasn't a significant difference (which can be good). They noticed that throughout this study some users learned how to do screenshot queries very quickly. In order to evaluate their application they used precision and recall and examined the top 10 matches for both the screenshot and keyword queries. Now that they had an understanding of how Sikuli would work, they developed an editor in order for users to write visual scripts in order to do certain tasks on their computer. One task was minimizing all active windows that someone would have open on their computer. Another one was deleting documents of multiple types. It could search for items that share the same icon and then delete them all if wanted. There was also a tracking bus movement and navigating a map application that would search for images on the map find a similar pattern. One of the more interesting, and possibly more convenient use would be to respond to message boxes automatically. Vista gives you pop ups every time asking you to choose if you really want to do something, and this script would allow you to make automatic responses to all of them that could pop up. The last one was image recognition to see whether a baby has rolled over or not. I wasn't sure if they actually marked the baby's forehead with a marker, or if it was digitally enhanced, but the script would monitor a baby/the image of the baby in order to see if it could detect the special marker placed on the baby's head. If it could not it would alert the user that it needs to check the baby to see if had rolled over or not. Below are pictures that were provided with each example.
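To give a feel for what a visual script looks like, here is a short sketch in the spirit of Sikuli Script's Python-like syntax for the message-box example. It assumes the Sikuli runtime (where functions like exists() and click() take screenshot files), and the image file names are placeholders; details may differ from the released tool.

```python
# Keep watching the screen and dismiss the confirmation dialog whenever it shows up.
while True:
    if exists("confirm_dialog.png"):      # is the dialog currently visible?
        click("continue_button.png")      # click the screenshot of its "Continue" button
    wait(2)                               # pause a couple of seconds before checking again
```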


A few problems with the application were discussed. One is the different themes an operating system can use and the different backgrounds users may have on their computers; when using a screenshot query, the system would need some way to ignore the unnecessary portions of the picture so it can better optimize the search. The second problem they mentioned was the visible screen: Sikuli can only see what is visible to a person, not any of the windows hidden behind other windows. They thought this could be addressed by creating a platform- or application-specific technique to overcome it. They didn't really go into any details on how to fix these problems yet, so I'm assuming this will be some of their future work.

Discussion:
I thought this paper was very interesting because it could make searching very simple and possibly much more accurate than having to come up with queries on your own. I think this work could be expanded in its possible uses, such as creating different scripts to automate tasks someone does on their computer every day: for example, a script that opens a browser and immediately opens tabs for the email accounts a person checks every day, or a website they always visit, like Facebook, so that it is more convenient to have all of the tabs open at once. That's just a small idea; I'm sure there are much better uses for this. It would be very interesting to see whether we can make searching faster and more accurate by using screenshots instead of keyword queries. I don't see many faults in this, but one way to possibly optimize the indexing portion of the search would be to use k-nearest neighbors or a larger k for the k-grams.

Wednesday, January 20, 2010

A Practical Pressure Sensitive Computer Keyboard

Paper Written By: Paul H. Dietz, Benjamin Eidelson, Jonathan Westhues, and Steven Bathiche (The Applied Sciences Group - Microsoft Corporation)


Summary:
This paper described how a normal keyboard works and then compared the pressure-sensitive keyboard the authors created to it. The picture below shows the structure located beneath the keys of an everyday keyboard.
Keyboards have not changed much in the years since computers were invented, but the mechanism underneath the keys has. IBM used to make keyboards with a spring mechanism, but because of the loud noise it made, manufacturers moved toward a quieter approach. This paper describes a pressure-sensitive keyboard that looks and feels just like a regular keyboard with the quiet mechanism, but can also report the pressure on each key independently. It could be manufactured just as easily as normal keyboards and would cost only slightly more. Below is a picture of the mechanics behind the pressure-sensitive keyboard.

This picture shows that it is just an extension of the original keyboard design shown before it. It is designed to decrease the resistance from the top layer to the bottom layer as the key is pressed, using the space provided by the dome structure beneath the key. The Applied Sciences Group at Microsoft believed it was more important for the keyboard to be a great keyboard than to be a pressure-sensitive one. It consists of many resistive contacts wired into row/column pairs. Pressure is measured by holding all of the rows and columns at the same voltage except the one where a reading is being taken, and this process repeats until every row/column pair has been read. This design also addresses the keyboard problem called "ghosting", where pressing three keys that form three corners of a rectangle in the row/column matrix makes a fourth, unpressed key appear to be down; this keyboard has almost eliminated the problem.

There were three applications of this keyboard that they mentioned: gaming, emotional instant messaging, and general typing. For gaming, it could enhance the way a person plays a computer game by letting them "control the intensity or degree of some keyboard function by the force level used to depress that key" (57). For emotional instant messaging, it would increase the font size with the intensity of a key stroke and even change the font depending on how you type. For general typing, it would help enable cleaner typing: when a person types rapidly it is easy to hit an extra key by accident or hold a key long enough that a character repeats, and pressure sensitivity can tell which key is actually being pressed and which was hit by accident. It also adds behavior to the backspace key: if the user presses lightly it deletes characters one by one, and if it is pressed with slightly greater force it deletes an entire word. The keyboard can be easy to learn, since software can be adjusted so a user learns at their own pace, and it could be manufactured for only a small premium over other keyboards.
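A simplified sketch of scanning such a pressure-sensitive key matrix. The hardware helper (read_resistance), the matrix size, and the resistance-to-pressure mapping are all assumptions made for illustration, not values from the paper:

```python
ROWS, COLS = 6, 18            # an assumed matrix size

def read_resistance(row, col):
    # Placeholder for the analog read at one row/column intersection.
    return 100_000            # ohms; a harder press lowers the resistance

def scan_matrix():
    """Read every row/column pair once and return a per-key pressure estimate."""
    pressures = {}
    for row in range(ROWS):
        for col in range(COLS):
            r = read_resistance(row, col)
            # Harder presses lower the resistance between the sheets,
            # so invert it into a rough 0..1 pressure value.
            pressures[(row, col)] = min(1.0, 10_000 / r)
    return pressures

frame = scan_matrix()         # one full scan = one "frame" of key pressures
```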

Discussion:
I think this paper was very interesting, and this keyboard could change the way we type documents or chat online. It could even make gaming more popular because you would have even more control over what happens in the game. I don't really see any faults with this work; it was able to overcome a common keyboard problem that no other keyboard seems to have solved, and it has many applications where it could be better than the average keyboard. I think future work could involve using it with Word: letting a group of people test out the keyboard with a Microsoft Word-like program and having them type out a paper, with some testers doing free writing while others write a paper for a class or something similar. I know I would love to try this in an instant messenger so that people would understand whether I was angry, happy, and/or excited.

Disappearing Mobile Devices

Paper Written by: Tao Ni and Patrick Baudisch


Summary:
Disappearing Mobile Devices dealt with how minimizing mobile devices and their interface hardware affects how the devices can be used. Devices such as laptops and cell phones have shrunk over the years but are limited in how small they can become by human constraints: the screen needs to be large enough for people to read and the keyboard has to be large enough for people to type on with their fingers. The smaller the device, the more difficult it can be for a person to interact with it. The paper discusses how touch, pressure, and motion place limits on how small a device can get while a person can still operate it. Two case studies were presented, both dealing with motion and whether the device could detect the motion someone performed, such as the directional motions (left, right, up, down) used in case one. That case had 12 participants who were each asked to choose an item by performing the directional motion given to them. The chart of results is shown below.
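A minimal sketch of how a directional marking gesture might be classified from the net finger displacement; the axis convention and the dominance test are my assumptions, not the paper's implementation:

```python
def classify_direction(dx, dy):
    """Map a finger displacement (dx, dy) to left/right/up/down."""
    if abs(dx) >= abs(dy):                 # horizontal movement dominates
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"      # screen-style y axis (down is positive)

print(classify_direction(12, -3))   # 'right'
print(classify_direction(-1, -9))   # 'up'
```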


Reasons for the errors were described in detail, such as a participant's hand shaking a little while holding the mouse or how they pointed the mouse while using it. The second case study dealt with entering characters from a gesture alphabet. The interface used a preset "gesture" alphabet, and many of the letters had more than one way of being written. This case had 24 participants who entered one character per trial, using either the Graffiti interface or the EdgeWrite interface provided to them. Before the trial they practiced each character about 10 times and then performed 8 blocks, each consisting of all the letters of the alphabet in a random order. A picture of the alphabets is shown below: the first shows the alphabet used in the EdgeWrite interface, while the second shows the alphabet for the Graffiti interface.
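For intuition, here is a rough illustration of template-based unistroke matching. This is a generic sketch rather than the EdgeWrite or Graffiti recognizers actually used in the study, and the two templates are invented stand-ins:

```python
import math

def normalize(stroke, n=16):
    """Translate to the centroid, scale to a unit box, and pick n evenly spaced samples."""
    xs, ys = zip(*stroke)
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    scale = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    pts = [((x - cx) / scale, (y - cy) / scale) for x, y in stroke]
    idx = [round(i * (len(pts) - 1) / (n - 1)) for i in range(n)]
    return [pts[i] for i in idx]

def recognize(stroke, templates):
    """Return the template letter whose normalized shape is closest on average."""
    s = normalize(stroke)
    def dist(template):
        return sum(math.dist(a, b) for a, b in zip(s, normalize(template))) / len(s)
    return min(templates, key=lambda letter: dist(templates[letter]))

templates = {
    "L": [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)],   # down, then right
    "V": [(0, 0), (1, 2), (2, 0)],                    # down-up stroke
}
print(recognize([(0, 0), (0, 3), (0, 6), (3, 6), (6, 6)], templates))  # 'L'
```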



Both studies were done on the same mouse-shaped device they created. The difference between the two was that in the marking case participants used their finger to make the motion, while in the second, alphabet-stroke case participants used their whole hand.

Discussion:
This paper is interesting because it tests how small we can make devices before they stop serving their intended purpose. It seems quite difficult to miniaturize devices that take human input because certain constraints need to exist for a person to use the device at all. I would have liked to see a video of this study to get a better understanding of what the paper was discussing, but I think the general idea was to see how the device they created would pick up the different motions a person made with either a finger or their entire hand. Some faults in the work may lie in the way each interface was used: there could have been computer errors that were not seen at the time, or there could easily have been some user error, since there were different participants for each case. If they had used the same participants for both cases there might have been better results. Future work could include adding a small screen a person could interact with, or possibly making the device into another form; for these studies it was shaped like a mouse, but what if they tried making it into a pen-like object, which would make the device even smaller? It would be interesting to compare results from both objects to see if the smaller one could perform just as well as the slightly larger one.