Introduction

This article discusses multi-touch gestures on Android from a beginner's perspective. It demonstrates an approach that allows for "standard" gestures such as slide-to-move and pinch-to-zoom, but also endeavours to go beyond those and attempt turn-to-rotate.

The focus of the article is the math required and how to capture the input needed to compute the gestures. The example application will of course provide a way of capturing and rendering the result of the gestures performed.

It will also discuss why the turn-to-rotate doesn't work that well, at least not on all devices.

The article does not aim to be a complete guide to gestures on Android, but rather a way in to understanding touch events and how they
can be used to manipulate an image.

Some experience working with Eclipse and the Android SDK is assumed, so I won't explain how to get that all set up (there are better explanations of that than I could ever provide).

Background

While writing an Android game I ran into some weird behaviour when using multi-touch gestures to control the game. The problem stemmed from the way my Android device handled multi-touch events when the touch points line up either vertically or horizontally.

To investigate the matter I put together a small app where I could test the behaviour in isolation. This article is based on that application.
I've uploaded a video of me playing around with the application. The quality of the video is quite poor, apologies for this.

Using the code

Download the project, unzip it and import it into your Eclipse workspace. I wrote this one in Pulsar, but any Eclipse installation with an appropriate Android SDK installed should do the trick.

The requirements

The example application aims to do three things:

  • Drag-To-Move
  • Pinch-To-Zoom
  • Turn-To-Rotate

I'll explain the maths behind all of them but first let's start at the beginning, capturing multi-touch events.

The basics

The Activity

In order to get the application up and running I created a new Android project in Eclipse and named it Gestures.

Since I needed an image to rotate, I added advert.png to the res/drawable-mdpi folder so that it automatically got added to my resources.
From the application's Activity (GesturesActivity) I then load that image and pass it as a Bitmap into a View implementation called SandboxView.
And that is all that the activity has to do for this small demo app; it loads the resource and sets the view:

public class GesturesActivity extends Activity {

  @Override
  public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);

    Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.advert);
    View view = new SandboxView(this, bitmap);

    setContentView(view);
  }
}

The View

The SandboxView is responsible both for rendering the image resource using the appropriate transform and for calculating that transform from the touch events captured on that View.

To capture the touch events a call to setOnTouchListener(OnTouchListener) has to be made to set the listener instance that will handle the events. In this article, for simplicity, the listener is the view itself and it is hooked up in the constructor.

public class SandboxView extends View implements OnTouchListener {

  private final Bitmap bitmap;

  private Matrix transform = new Matrix();

  private Vector2D position = new Vector2D();
  private float scale = 1;
  private float angle = 0;

  public SandboxView(Context context, Bitmap bitmap) {
    super(context);

    this.bitmap = bitmap;

    setOnTouchListener(this);
  }
}

The view is initialized with a Bitmap which is the image that will be manipulated using the touch events. A Matrix called transform is used to make sure the image is rendered with the correct translation (or position), rotation and scale. The position, scale and angle are variables to be manipulated by the touch event, and they'll be used as post-transform on the transform.

To have the View render the image I've overridden onDraw(Canvas canvas) instead of relying on a layout defined in a resource xml. The reason for doing so is that it allows full control over the rendering, both in terms of what is rendered but also when it is rendered (to an extent).

@Override
protected void onDraw(Canvas canvas) {
  super.onDraw(canvas);

  Paint paint = new Paint();

  // width and height are assumed to be the bitmap's dimensions,
  // held in fields that the listing above does not show.
  transform.reset();
  transform.postTranslate(-width / 2.0f, -height / 2.0f);
  transform.postRotate(getDegreesFromRadians(angle));
  transform.postScale(scale, scale);
  transform.postTranslate(position.getX(), position.getY());

  canvas.drawBitmap(bitmap, transform, paint);
}

Four transforms are applied to get the final transform of the image:

  • 1. Translate to (negative half width, negative half height). This causes the image to center around (0, 0), which helps with the subsequent rotation and scaling.
  • 2. Rotate by the angle calculated from the touch events. This has to be converted to degrees as that's what the postRotate method accepts (I like to work in radians).
  • 3. Scale by the scale calculated from the touch events. I scale uniformly here (i.e. the vertical scale is the same as the horizontal) because that's the "normal" behaviour for Pinch-to-Zoom, but there's no reason why the touch event can't also dictate the amount of scale per axis.
  • 4. Translate to the position calculated from the touch events.

Remember that matrix multiplication (which is what's going on here) is order dependent, so rearranging these will produce weird results.
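To make the point about ordering concrete, here is a minimal sketch that applies the same four steps to a single point using plain arithmetic rather than android.graphics.Matrix, so it can run off-device. The class and method names are made up for this illustration and are not part of the example application.

```java
// Hypothetical stand-alone sketch: mimics the four post* steps with plain math.
final class TransformOrderSketch {

    /** Order used in onDraw: center, rotate, scale, then translate. */
    static float[] centerRotateScaleTranslate(float x, float y, float width, float height,
                                              float angle, float scale, float posX, float posY) {
        // 1. Move the bitmap center to the origin.
        float cx = x - width / 2.0f;
        float cy = y - height / 2.0f;
        // 2. Rotate around the origin (angle in radians).
        float cos = (float) Math.cos(angle);
        float sin = (float) Math.sin(angle);
        float rx = cx * cos - cy * sin;
        float ry = cx * sin + cy * cos;
        // 3. Scale uniformly, 4. translate to the final position.
        return new float[] { rx * scale + posX, ry * scale + posY };
    }

    /** Same steps, but with the position translation moved before the rotation. */
    static float[] centerTranslateRotateScale(float x, float y, float width, float height,
                                              float angle, float scale, float posX, float posY) {
        float cx = x - width / 2.0f + posX;
        float cy = y - height / 2.0f + posY;
        float cos = (float) Math.cos(angle);
        float sin = (float) Math.sin(angle);
        // The position offset is now rotated and scaled along with the image.
        return new float[] { (cx * cos - cy * sin) * scale, (cx * sin + cy * cos) * scale };
    }

    public static void main(String[] args) {
        float quarterTurn = (float) (Math.PI / 2);
        float[] good = centerRotateScaleTranslate(50, 25, 100, 50, quarterTurn, 2, 200, 300);
        float[] bad = centerTranslateRotateScale(50, 25, 100, 50, quarterTurn, 2, 200, 300);
        System.out.println("onDraw order:  " + good[0] + ", " + good[1]);
        System.out.println("swapped order: " + bad[0] + ", " + bad[1]);
    }
}
```

With the onDraw order the center of the bitmap lands on the requested position (200, 300). With the translation moved earlier it ends up near (-600, 400), because the offset itself gets rotated and scaled.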

Capturing touch events

The basics

On the Android platform, one way of capturing touch events is to set an OnTouchListener on a View, and that's the approach I've taken in this article.

There are other ways of handling gestures on Android, but I've gone for this approach because it's a low-level way of doing things and that makes it easier to understand what's going on. I think.
As described by the View code listing above, the view itself is an implementation of OnTouchListener so this is simply passed to the setOnTouchListener call in the constructor. The View does not have to be the implementation, but since this is such a small example application I've kept it that way out of convenience.

Once setOnTouchListener has been called, the touch handler method, boolean onTouch(View v, MotionEvent event), will be called whenever a touch event is generated. This happens when the user touches the screen, obviously.
The parameter of type MotionEvent that is passed to this method contains the information required to handle gestures, but it's not necessarily in a very convenient format.

Information such as the X and Y coordinates of the touch event is included, but for handling gestures we're more interested in the motion than in the current coordinates alone. To detect motion we need to track both the current and the previous positions so that the difference between them can be calculated. This difference, or delta, then tells us what gesture has been performed.
The MotionEvent class contains history (the size of which can be queried using the getHistorySize method), and that contains information about events that happened between this call and the previous one.
That means that if events are happening faster than we're processing them we still get to hear about them, which is cool. We cannot rely only on history though, and because of this we need to track our own history.

Tracking history

I implemented a helper class called TouchManager that helps me keep track of not only a bit of motion history, but the position and history of all current touches. Remember that we're trying to implement pinch-to-zoom here and that requires us to track not one but two simultaneous touch events, one for each finger.
It's easy to use vectors to describe things like points and directions, so my example application contains a simple two-dimensional vector implementation, Vector2D.
There are many good articles on vectors, vector math and implementing vectors so I won't go into too much detail on this.
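For readers who have not written a vector class before, the following is a minimal sketch of the kind of operations the rest of the article leans on. It is not the Vector2D from the download; the class name Vec2Sketch is made up here, and only the method names (add, subtract, getLength) are borrowed from calls that appear in the article's listings.

```java
// Hypothetical minimal 2D vector; the real Vector2D in the project may differ.
final class Vec2Sketch {
    private float x;
    private float y;

    Vec2Sketch(float x, float y) {
        this.x = x;
        this.y = y;
    }

    float getX() { return x; }
    float getY() { return y; }

    /** Euclidean length of the vector. */
    float getLength() {
        return (float) Math.sqrt(x * x + y * y);
    }

    /** Adds other to this vector in place and returns this for chaining. */
    Vec2Sketch add(Vec2Sketch other) {
        x += other.x;
        y += other.y;
        return this;
    }

    /** Returns a new vector a - b, leaving both inputs untouched. */
    static Vec2Sketch subtract(Vec2Sketch a, Vec2Sketch b) {
        return new Vec2Sketch(a.x - b.x, a.y - b.y);
    }

    public static void main(String[] args) {
        Vec2Sketch a = new Vec2Sketch(4, 6);
        Vec2Sketch b = new Vec2Sketch(1, 2);
        Vec2Sketch d = subtract(a, b);      // (3, 4)
        System.out.println(d.getLength());  // prints 5.0
    }
}
```

Subtracting two points gives the vector from one to the other, and the length of that vector is the distance between them. Those two operations cover most of what the gestures need.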

Structure of TouchManager

The touch manager is a fairly simple class, designed to record and store the current and previous positions of up to N simultaneous touches.

Its constructor initializes the basics:

public class TouchManager {

  private final int maxNumberOfTouchPoints;

  private final Vector2D[] points;
  private final Vector2D[] previousPoints;

  public TouchManager(final int maxNumberOfTouchPoints) {
    this.maxNumberOfTouchPoints = maxNumberOfTouchPoints;

    points = new Vector2D[maxNumberOfTouchPoints];
    previousPoints = new Vector2D[maxNumberOfTouchPoints];
  }

  ...
}

One array is for current points and one array is for previous points.

The data stored in these arrays is then exposed through the following methods (index being the "id" so to speak of the touch, first touch gets index 0, second gets 1, typically):

public class TouchManager {

  // Returns true if touch index is pressed
  public boolean isPressed(int index) {
    ...
  }

  // Returns the number of current touch points
  public int getPressCount() {
    ...
  }

  // Returns the delta between current and previous touch with index 'index'
  public Vector2D moveDelta(int index) {
    ...
  }

  // The (x, y) point for touch index
  public Vector2D getPoint(int index) {
    ...
  }

  // The (x, y) point for previous touch index
  public Vector2D getPreviousPoint(int index) {
    ...
  }

  // The vector that is the difference between two simultaneous touches
  public Vector2D getVector(int indexA, int indexB) {
    ...
  }

  // The vector that is the difference between two previous simultaneous touches
  public Vector2D getPreviousVector(int indexA, int indexB) {
    ...
  }
}
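The method bodies are elided in the listing above. Purely as an illustration of the idea, and not as the project's actual implementation, here is a sketch of how accessors such as isPressed, getPressCount and moveDelta could be backed by a "current" and a "previous" array. The class TouchSlotsSketch, its record and release methods, and the use of float arrays instead of Vector2D are all assumptions made for this example.

```java
// Hypothetical sketch of the current/previous bookkeeping; not the real TouchManager.
final class TouchSlotsSketch {
    // Each slot holds {x, y}, or null when that touch is not pressed.
    private final float[][] points;
    private final float[][] previousPoints;

    TouchSlotsSketch(int maxNumberOfTouchPoints) {
        points = new float[maxNumberOfTouchPoints][];
        previousPoints = new float[maxNumberOfTouchPoints][];
    }

    /** Stores a new position for a slot and shifts the old one into history. */
    void record(int index, float x, float y) {
        // On the first sample previous equals current, so the delta is zero.
        previousPoints[index] = (points[index] == null) ? new float[] { x, y } : points[index];
        points[index] = new float[] { x, y };
    }

    /** Clears a slot when its finger is lifted. */
    void release(int index) {
        points[index] = null;
        previousPoints[index] = null;
    }

    boolean isPressed(int index) {
        return points[index] != null;
    }

    int getPressCount() {
        int count = 0;
        for (float[] p : points) {
            if (p != null) {
                count++;
            }
        }
        return count;
    }

    /** Current minus previous position, or (0, 0) when the slot is not pressed. */
    float[] moveDelta(int index) {
        if (!isPressed(index)) {
            return new float[] { 0f, 0f };
        }
        return new float[] {
            points[index][0] - previousPoints[index][0],
            points[index][1] - previousPoints[index][1]
        };
    }
}
```

Keeping a null in unused slots is what lets the press count be derived from the array itself, which mirrors how the article's update method clears entries when a finger is lifted.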

Handling onTouch

When a touch event fires, the event is passed directly to the TouchManager using the void update(MotionEvent event) method. This method is responsible for inspecting the event and populating the backing arrays.

The first thing to do is to figure out what kind of event this was: was it caused by the user pressing the screen, dragging a finger across the screen, or lifting a finger off the screen?

This information is contained in the action, but a bit of masking is required in order to make sense of it.

public void update(MotionEvent event) {
  int actionCode = event.getAction() & MotionEvent.ACTION_MASK;

  if (actionCode == MotionEvent.ACTION_POINTER_UP || actionCode == MotionEvent.ACTION_UP) {
    // The action carries a pointer index, while the arrays are keyed by
    // pointer id (see below), so translate before clearing the slot.
    int pointerIndex = event.getAction() >> MotionEvent.ACTION_POINTER_ID_SHIFT;
    int index = event.getPointerId(pointerIndex);
    previousPoints[index] = points[index] = null;
  }
  else {
    for(int i = 0; i < maxNumberOfTouchPoints; ++i) {
      if (i < event.getPointerCount()) {
        int index = event.getPointerId(i);

        Vector2D newPoint = new Vector2D(event.getX(i), event.getY(i));

        if (points[index] == null)
          points[index] = newPoint;
        else {
          if (previousPoints[index] != null) {
            previousPoints[index].set(points[index]);
          }
          else {
            previousPoints[index] = new Vector2D(newPoint);
          }

          // Sanity check, if it moves by too much then ignore it
          if (Vector2D.subtract(points[index], newPoint).getLength() < 64)
            points[index].set(newPoint);
        }
      }
      else {
        previousPoints[i] = points[i] = null;
      }
    }
  }
}

And that's essentially it for recording the touch events. At this point we know the (X, Y) position of the user's finger as well as where on the screen the finger just was. Using this we can start looking into determining how to move the on-screen content accordingly.
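To make the masking in update a little more concrete, here is a small sketch that decodes a raw action value without needing a device. The constant values are copied from android.view.MotionEvent; the class ActionDecodeSketch and its method names are made up for this illustration.

```java
// Hypothetical off-device sketch of how a MotionEvent action value is packed.
final class ActionDecodeSketch {
    // Values copied from android.view.MotionEvent.
    static final int ACTION_MASK = 0xff;
    static final int ACTION_POINTER_INDEX_MASK = 0xff00;
    static final int ACTION_POINTER_INDEX_SHIFT = 8;
    static final int ACTION_POINTER_UP = 6;

    /** The low byte says what happened (down, up, move, pointer up, ...). */
    static int actionCode(int action) {
        return action & ACTION_MASK;
    }

    /** The next byte says which pointer index the action refers to. */
    static int pointerIndex(int action) {
        return (action & ACTION_POINTER_INDEX_MASK) >> ACTION_POINTER_INDEX_SHIFT;
    }

    public static void main(String[] args) {
        int action = 0x0106; // "pointer up" for the pointer at index 1
        System.out.println(actionCode(action) == ACTION_POINTER_UP); // prints true
        System.out.println(pointerIndex(action));                    // prints 1
    }
}
```

Note that the value extracted this way is a pointer index, which is only a position within the current event. MotionEvent.getPointerId(int) maps that index to a pointer id that stays the same for as long as the finger is down, which is why ids make better array keys.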

Gestures

Drag-To-Move

Drag-To-Move is the simplest gesture because it only requires one finger. Essentially as the user drags a single finger across the screen, we want the content to move the same distance in the same direction.

As illustrated in the diagram below:

Drag.png

Blue square is the screen, green square is the content. Green circle is current touch, red circle is previous touch.

As the diagram shows us, we want to move, or translate, the green square by the difference between the two vectors that make up the first and the second touch point.
If the current point is (2, 2) and the previous one was (5, 5) then we would want to add

(2, 2) - (5, 5) = (-3, -3)

to the content's current location. That then means that if the content was at position (3, 3) its new position would be

(3, 3) + (-3, -3) = (0, 0).

Methods required to do that piece of simple maths are available on the Vector2D implementation.

Simple. And that's Drag-To-Move.
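The same arithmetic can be written as a tiny stand-alone sketch. It uses plain float arrays instead of Vector2D so that it runs on its own; the class and method names are made up for this illustration.

```java
// Hypothetical sketch of one drag step using {x, y} float arrays.
final class DragSketch {

    /** Returns the content position after moving it by (current - previous). */
    static float[] drag(float[] position, float[] current, float[] previous) {
        float dx = current[0] - previous[0];
        float dy = current[1] - previous[1];
        return new float[] { position[0] + dx, position[1] + dy };
    }

    public static void main(String[] args) {
        float[] moved = drag(new float[] { 3, 3 }, new float[] { 2, 2 }, new float[] { 5, 5 });
        System.out.println(moved[0] + ", " + moved[1]); // prints 0.0, 0.0
    }
}
```

Because only the delta is applied, it does not matter where on the content the finger went down; the content simply follows the finger.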

In the onTouch method, after the touch information has been recorded in the TouchManager, the delta is applied to the position of the content, provided only one finger is currently down:

public boolean onTouch(View v, MotionEvent event) {
  try {
    touchManager.update(event);

    if (touchManager.getPressCount() == 1) {
      position.add(touchManager.moveDelta(0));
    }
    else {
      if (touchManager.getPressCount() == 2) {
        ...
      }
    }

    invalidate();
  }
  catch(Throwable t) {
    // So lazy...
  }
  return true;
}

The method moveDelta is called with 0 as an argument, meaning that we ask for the information about the first (and only) finger.

Pinch-To-Zoom

Pinch-To-Zoom is slightly more complicated because it involves two fingers, but not by much, as it is still pretty much only about movement delta.

As two fingers pinch or "un-pinch" we need to figure out by how much and then apply that delta as a scale to the content, as shown in the diagram below:

Pinch.png

In the left hand side diagram, green is position of the first touch, blue is position of second touch.

In the right hand side diagram green and blue are the current positions of the first touch and second touch, respectively, while red and purple are the original positions of the first and second touch.

The amount to zoom in or out can be calculated by looking at the relative difference in the length of the vectors that make up the distance between the first and second touch.

That means that for Pinch-To-Zoom we need to figure out two distances:

PinchDistance.png

The white line in this diagram represents the distance between the two fingers when they first touched the screen, while the black line is the distance between the fingers after they have moved apart.

Calculating these two vectors is simply a matter of subtracting one position from the other (which is an operation we already did in the Drag-To-Move gesture). When the two vectors have been calculated, the quotient of their lengths is the factor that we need to apply to the current scale.
That yields:

White vector: PrevPos1 - PrevPos2 = PrevDeltaVec

Black vector: CurrentPos1 - CurrentPos2 = CurrentDeltaVec

Scale adjustment: Scale = Scale * length(CurrentDeltaVec) / length(PrevDeltaVec)

Or, expressed in code (where scale is the member variable of the SandboxView):

Vector2D current = touchManager.getVector(0, 1);
Vector2D previous = touchManager.getPreviousVector(0, 1);
float currentDistance = current.getLength();
float previousDistance = previous.getLength();

// Guard against division by zero
if (previousDistance != 0) {
  scale *= currentDistance / previousDistance;
}

So if the first distance was 32 and the second distance was 64 (and the current scale is 1.0), then the new scale is 1.0 * 64 / 32 = 2.0.

Increasing the scale zooms in.
Again, most of the math is hidden by the TouchManager and Vector2D classes.
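As a stand-alone illustration of the scale update, the sketch below computes the finger distances from raw coordinates and applies the same guarded ratio. The class PinchSketch and its method names are assumptions made for this example and do not exist in the project.

```java
// Hypothetical sketch of the pinch scale update using raw coordinates.
final class PinchSketch {

    /** Distance between two touch points. */
    static float distance(float ax, float ay, float bx, float by) {
        return (float) Math.hypot(bx - ax, by - ay);
    }

    /** Multiplies the scale by the ratio of current to previous finger distance. */
    static float updatedScale(float scale, float previousDistance, float currentDistance) {
        // Guard against division by zero, as in the article's listing.
        if (previousDistance == 0f) {
            return scale;
        }
        return scale * currentDistance / previousDistance;
    }

    public static void main(String[] args) {
        float previous = distance(0, 0, 32, 0); // fingers 32 apart
        float current = distance(0, 0, 64, 0);  // fingers 64 apart
        System.out.println(updatedScale(1.0f, previous, current)); // prints 2.0
    }
}
```

Because each step multiplies the existing scale, the zoom compounds smoothly over many small touch events instead of being recomputed from the positions where the fingers first landed.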

Turn-To-Rotate

When it comes to Turn-To-Rotate, a lot is common with Pinch-To-Zoom; two fingers are used and we're inspecting not so much the absolute positions of the touches but rather a delta between the current and previous touches.

Rotate.png

In these diagrams, and yes I am aware that I am not very good at creating diagrams, the colors have the same meaning as in the Pinch-To-Zoom diagram.

What we're looking for is the change in angle between the vectors that you get from subtracting one touch position from the other.

That means that we want the angle between the vector you get if you subtract the position of the first finger from the position of the second finger, and the vector you get if you do the same for the previous positions.

RotateAngle.png

Letting the black and white vectors represent the same as in the Pinch-To-Zoom diagram, we want the angle between them, marked with a yellow arc. Now, you might argue that two vectors have not one, but two angles between them, one of less than 180 degrees and one that is more than 180 degrees.

We always want the smaller of the two, because that's the most likely scenario.

Euclidean geometry teaches us that the dot product of two vectors is equal to the cosine of the angle between them multiplied by the product of their lengths. From this we can derive the angle since we have both the lengths and the dot product.

And that's what I tried first, but that approach does not work as it only gives us the magnitude of the angle; there is no sign on it (or there is, but it's always positive). That means we cannot distinguish a clockwise turn from a counter-clockwise one.
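The limitation is easy to see in a small sketch. The class UnsignedAngleSketch below is made up for this illustration; it derives the angle from the dot product and shows that turning 30 degrees one way or the other gives the same answer.

```java
// Hypothetical sketch: the angle recovered from a dot product has no sign.
final class UnsignedAngleSketch {

    /** Angle in radians between vectors a and b, always in the range [0, pi]. */
    static double angleFromDot(double ax, double ay, double bx, double by) {
        double lengths = Math.hypot(ax, ay) * Math.hypot(bx, by);
        if (lengths == 0.0) {
            return 0.0; // no direction to compare against
        }
        double dot = ax * bx + ay * by;
        // Clamp so rounding error cannot push acos outside its domain.
        double cos = Math.max(-1.0, Math.min(1.0, dot / lengths));
        return Math.acos(cos);
    }

    public static void main(String[] args) {
        double turn = Math.toRadians(30);
        // Start along the x axis, then turn 30 degrees in each direction.
        double oneWay = angleFromDot(1, 0, Math.cos(turn), Math.sin(turn));
        double otherWay = angleFromDot(1, 0, Math.cos(turn), -Math.sin(turn));
        System.out.println(oneWay + " vs " + otherWay);
    }
}
```

Both calls report roughly 0.52 radians, so the direction of the turn is lost. That is the reason a different function is needed for rotation.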

In order to find the signed angle between two vectors we can use atan2.

For normalized vectors A and B, the signed angle between them is:

deltaAngle = atan2(B.y, B.x) - atan2(A.y, A.x)

public class Vector2D {
  ...

  public static Vector2D getNormalized(Vector2D v) {
    float l = v.getLength();
    if (l == 0)
      return new Vector2D();
    else
      return new Vector2D(v.x / l, v.y / l);
  }

  public static float getSignedAngleBetween(Vector2D a, Vector2D b) {
    Vector2D na = getNormalized(a);
    Vector2D nb = getNormalized(b);

    return (float)(Math.atan2(nb.y, nb.x) - Math.atan2(na.y, na.x));
  }
}

That means that to get the new rotation angle, simply take the current angle and add this delta angle to it.
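One detail worth knowing is that the difference of two atan2 results can fall anywhere between -2π and 2π, for example when the fingers cross the negative x axis. Adding such a value to the angle still renders correctly, since a full turn changes nothing, but if a small per-event delta is wanted it can be folded back into (-π, π]. The sketch below shows one way to do that; the class SignedAngleSketch is made up for this illustration, and it skips the normalization step because atan2 depends only on direction.

```java
// Hypothetical sketch: signed angle between two vectors, folded into (-pi, pi].
final class SignedAngleSketch {

    /** Signed angle in radians that turns vector a into the direction of vector b. */
    static double signedAngleBetween(double ax, double ay, double bx, double by) {
        double delta = Math.atan2(by, bx) - Math.atan2(ay, ax);
        // The raw difference lies in [-2*pi, 2*pi]; fold it into (-pi, pi].
        if (delta > Math.PI) {
            delta -= 2 * Math.PI;
        } else if (delta <= -Math.PI) {
            delta += 2 * Math.PI;
        }
        return delta;
    }

    public static void main(String[] args) {
        // A quarter turn from the x axis to the y axis.
        System.out.println(signedAngleBetween(1, 0, 0, 1));
        // A small turn that crosses the negative x axis.
        System.out.println(signedAngleBetween(-1, -0.1, -1, 0.1));
    }
}
```

The first call gives about 1.57 radians. The second gives about -0.2 radians rather than the raw difference of roughly 6.08, which keeps successive deltas small and comparable.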

Points of Interest

On the device I tried this on (and you will need a physical device to try it on, as the emulator does not support multi-touch), the hardware was not very good at capturing multiple touches.

It was especially bad when the fingers lined up either horizontally or vertically. This is very visible in the video showing the example application.

I believe this is mainly down to the hardware of the device in the video, as I tried the code on a newer device and it worked fine (or at least a lot better). But limitations like this are important to keep in mind when developing mobile applications. Whenever you decide to venture out from the simplest parts of the API it becomes more and more important to test the application on a wide range of devices, much like web development work has to be tested across multiple browsers.

Apologies for the poor quality video. It was captured using a Canon IXUS 9015. Held by me, while doing the gestures. In poor lighting. No sound, because my kid had just gone to sleep :)

As always, any comments are most welcome.

History

  • 2012-01-24; First version