
Learn how to automate mobile gestures with Appium using W3C Actions API, mobile commands, and Appium Inspector for Android and iOS testing.

Wasiq Bhamla
March 15, 2026
Mobile gestures in Appium are touch interactions such as tap, swipe, scroll, drag, long press, and pinch that simulate how a user interacts with a mobile application.
Appium executes these gestures using the W3C Actions API, which sends low-level touch input events to Android or iOS automation frameworks.
In practice, these gestures are used to automate real user behaviors like scrolling lists, swiping carousels, dragging elements, or zooming images inside mobile applications.
What Are the Basic Gesture Actions in Appium?
Appium provides several core gesture actions that simulate real user touch interactions on mobile devices. These gestures form the foundation of mobile test automation.
How to Implement Gestures With Appium
There are multiple approaches available for automating gestures in Appium, each with different levels of flexibility and platform support.
If you haven't set up Appium yet, follow this Appium tutorial for complete installation and configuration steps.
Note: Run Appium tests across 10,000 real Android & iPhone devices. Try TestMu AI Now!
Before automating gestures, you need a working Appium environment with Node.js, the Appium server, and platform drivers (UiAutomator2 for Android, XCUITest for iOS) installed.
If you need platform-specific setup instructions, refer to this detailed walkthrough on how to install Appium.
Appium Inspector lets you visually build, test, and debug gestures before writing automation code. Download the latest version from their GitHub release page.
Before opening Appium Inspector, start the Appium server by executing the following command on the terminal:
appium server --port 4723 --use-drivers uiautomator2

Once the Appium server instance has started, open Appium Inspector and create the session capabilities, using the following capabilities as an example:
{
"platformName": "Android",
"appium:deviceName": "Pixel 9 Pro",
"appium:automationName": "UiAutomator2",
"appium:avd": "Pixel_9_Pro",
"appium:app": "/path/to/apps/android/wdio-demo.apk",
"appium:appWaitActivity": "com.wdiodemoapp.MainActivity",
"platformVersion": "15"
}

Following is a screenshot of the Appium Inspector capabilities builder page:

Click the Start Session button to start an Appium Inspector session on the running Appium server. It will automatically launch the emulator if it is not already running.
To build custom gestures in Appium, open the Gestures tab in Appium Inspector after starting a session, then use the gesture builder to define tap, swipe, pinch, and drag actions visually.
You can create a new gesture just by clicking on the + button available towards the bottom of the screen.

Let's take an example of creating a simple swipe up gesture. The screenshot below shows the swipe up gesture:

Let's break down the swipe up gesture created in this screenshot.
You can test the gesture by clicking the play button located just before the Save button. Once you click it, the gesture will be executed on the attached device, whether a real device or an emulator.
You can also save this gesture by providing a title and description and then clicking the Save As button. Once saved, the gesture appears in a list, as shown in the screenshot below:

Apart from gesture building, Appium Inspector also supports element hierarchy inspection, page source viewing, and action recording. All of these features are explored in depth under Appium Inspector for apps.
The swipe down gesture follows the same Pointer Move → Pointer Down → Pointer Move → Pointer Up pattern as swipe up. The only difference is the second Pointer Move direction, which moves the end coordinate towards the bottom of the screen.

The swipe left gesture uses the same pattern. The second Pointer Move changes the end coordinate towards the left of the screen by decreasing the X-axis value.

The swipe right gesture is identical to swipe left except the second Pointer Move increases the X-axis coordinate, moving the end point towards the right of the screen.

The drag and drop gesture uses the same pattern as swipe. The starting coordinate is the center of the drag element and the end coordinate is the center of the drop zone element.

A tap gesture is simpler than swipe since it does not require a second Pointer Move. The sequence is Pointer Move to target → Pointer Down → Pointer Up. For double tap gestures, repeat the Pointer Down and Pointer Up steps again.

Zoom in requires two fingers working simultaneously. Both fingers start near the center of the screen. The index finger moves upward while the thumb moves downward, spreading apart.

For zoom out, the movement is reversed. Both fingers start apart and move towards the center of the screen, pinching inward.

Once you save a gesture in Appium Inspector, you have the option to export it to a JSON file.
I typically export gesture files and share them with my team so they can simply import them on their machine and start using the gesture right away.
Check out the screenshot below where the import and export gesture button has been highlighted:

Let's check out the exported JSON file example:
{
"name": "Drag Drop Gesture",
"description": "Drag and drop gesture...",
"actions": [{
"name": "Finger1",
"ticks": [
{ "id": "1.1", "type": "pointerMove", "x": 12.7, "y": 80.6, "duration": 0 },
{ "id": "1.2", "type": "pointerDown", "button": 0 },
{ "id": "1.3", "type": "pointerMove", "x": 30.2, "y": 34.1, "duration": 1077 },
{ "id": "1.4", "type": "pointerUp", "button": 0 }
],
"color": "#FF3333", "id": "1"
}],
"id": "b79f9828-...", "date": 1766061332318
}

For a single gesture, there will be only one JSON file. You cannot add multiple gestures in the same file.
Let's understand the JSON format better; this will be helpful if you want to create a custom gesture manually instead of exporting it from Appium Inspector.
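As an illustration, here is a minimal hand-written single-tap gesture following the same schema as the export above. The name, coordinates, and IDs here are made up for the example; note that in the exported sample, the x and y values appear to be percentages of the screen dimensions rather than absolute pixels.

```json
{
  "name": "Tap Gesture",
  "description": "Hand-written single tap following the exported schema",
  "actions": [{
    "name": "Finger1",
    "ticks": [
      { "id": "1.1", "type": "pointerMove", "x": 50, "y": 50, "duration": 0 },
      { "id": "1.2", "type": "pointerDown", "button": 0 },
      { "id": "1.3", "type": "pointerUp", "button": 0 }
    ],
    "color": "#33FF33",
    "id": "1"
  }]
}
```

Because there is no second pointerMove, the finger presses and releases at the same spot, which is exactly the tap pattern described earlier.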
Appium gestures are implemented using the W3C Actions API through PointerInput and Sequence classes, or via platform-specific mobile commands like mobile: clickGesture for Android and iOS.
Let's now implement all the finger gestures mentioned above using Appium.
The W3C Actions API is defined in the W3C WebDriver specification, which provides a standard, platform-independent approach for Android and iOS alike. The API is exposed through the PointerInput and Sequence classes available in Selenium WebDriver, which Appium uses internally.
These classes rely on the session capabilities to determine the target platform and device, so configuring the right Appium capabilities ensures your gesture tests run reliably. Let's check out how to create all the different actions using the W3C Actions API.
Having looked at the tap gesture in Appium Inspector, let's now implement the same logic using the Appium Java client. The following code snippet implements the tap gesture:
public void tap (final WebElement element) {
final var center = getElementCenter (element);
final var finger = new PointerInput (PointerInput.Kind.TOUCH, "Finger 1");
final var sequence = new Sequence (finger, 0);
sequence.addAction (finger.createPointerMove (Duration.ofMillis (500), PointerInput.Origin.viewport (), center.getX (), center.getY ()));
sequence.addAction (finger.createPointerDown (PointerInput.MouseButton.LEFT.asArg ()));
sequence.addAction (finger.createPointerUp (PointerInput.MouseButton.LEFT.asArg ()));
this.driver.perform (Collections.singletonList (sequence));
}
The getElementCenter helper calculates the center using the element's location and size. It divides width and height by 2 and adds the element's X and Y coordinates.
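The getElementCenter helper itself is not shown in the snippet; a minimal sketch of the arithmetic it is described as performing might look like this, using java.awt.Point as a stand-in for the Selenium Point type (the class and method names here are assumptions, not the article's actual helper):

```java
import java.awt.Point;

public class GestureMath {
    // Hypothetical sketch of getElementCenter: the element's top-left
    // location plus half of its width and height.
    static Point elementCenter(int locX, int locY, int width, int height) {
        return new Point(locX + width / 2, locY + height / 2);
    }

    public static void main(String[] args) {
        // An element at (100, 200) with size 50x80 has its center at (125, 240).
        Point center = elementCenter(100, 200, 50, 80);
        System.out.println(center.x + "," + center.y);
    }
}
```

In a real test you would feed element.getLocation() and element.getSize() values into this calculation.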
A long press gesture extends the tap by adding a pause between the pointer down and pointer up actions. The pause duration controls how long the finger holds before releasing. Here is the implementation:
public void longPress (final WebElement element, final Duration holdDuration) {
final var center = getElementCenter (element);
final var finger = new PointerInput (PointerInput.Kind.TOUCH, "Finger 1");
final var sequence = new Sequence (finger, 0);
sequence.addAction (finger.createPointerMove (Duration.ofMillis (500), PointerInput.Origin.viewport (), center.getX (), center.getY ()));
sequence.addAction (finger.createPointerDown (PointerInput.MouseButton.LEFT.asArg ()));
sequence.addAction (new Pause (finger, holdDuration));
sequence.addAction (finger.createPointerUp (PointerInput.MouseButton.LEFT.asArg ()));
this.driver.perform (Collections.singletonList (sequence));
}

The key difference from a tap is the Pause action between pointer down and pointer up. Pass the desired hold duration (e.g., Duration.ofSeconds(2)) to control how long the press is held before release.
Now let's check out the implementation for swipe up gesture using the Java client:
public void swipeUp (final WebElement element, final int distance) {
final var start = getSwipeStartPosition (element);
final var end = getSwipeEndPosition (new Point (0, -1), element, distance);
final var finger = new PointerInput (PointerInput.Kind.TOUCH, "Finger 1");
final var sequence = new Sequence (finger, 0);
sequence.addAction (finger.createPointerMove (Duration.ofMillis (500), PointerInput.Origin.viewport (), start.getX (), start.getY ()));
sequence.addAction (finger.createPointerDown (PointerInput.MouseButton.LEFT.asArg ()));
sequence.addAction (new Pause (finger, Duration.ofMillis (500)));
sequence.addAction (finger.createPointerMove (Duration.ofMillis (300), PointerInput.Origin.viewport (), end.getX (), end.getY ()));
sequence.addAction (finger.createPointerUp (PointerInput.MouseButton.LEFT.asArg ()));
this.driver.perform (Collections.singletonList (sequence));
}

The direction point (0, -1) moves the finger upward on the Y-axis. The helper methods getSwipeStartPosition and getSwipeEndPosition calculate coordinates based on the element center or screen center and a distance percentage (1-100).
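The two swipe helpers are not shown above; the coordinate arithmetic they are described as performing can be sketched in pure Java, with java.awt.Point standing in for the Selenium Point type and the screen size passed explicitly (all names here are assumptions):

```java
import java.awt.Point;

public class SwipeMath {
    // Hypothetical sketch: the swipe starts at a given center point and ends
    // `distancePercent` percent of the screen size away in `direction`,
    // where the direction components are -1, 0, or 1 as in the snippets above.
    static Point swipeEnd(Point start, Point direction, int distancePercent,
                          int screenWidth, int screenHeight) {
        int dx = direction.x * (screenWidth * distancePercent / 100);
        int dy = direction.y * (screenHeight * distancePercent / 100);
        return new Point(start.x + dx, start.y + dy);
    }

    public static void main(String[] args) {
        Point start = new Point(540, 1200); // center of a 1080x2400 screen
        // A 25% swipe up from the center ends 600 px higher on the Y-axis.
        Point end = swipeEnd(start, new Point(0, -1), 25, 1080, 2400);
        System.out.println(end.x + "," + end.y);
    }
}
```

Keeping the direction as a unit point means the same math serves all four swipe directions.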
The swipe down implementation is mostly identical to swipe up; the only difference is the direction point:

final var direction = new Point (0, 1);

The finger moves downwards when performing swipe down, which means the Y-axis coordinate increases to emulate the gesture.
Now let's check out how the swipe left gesture is implemented.
final var direction = new Point (-1, 0);

Here, the X-axis component of the direction point is affected. The finger starts at a particular point on the X-axis and moves towards the left by reducing the X-axis coordinate, hence the direction X value is set to -1.
The swipe right gesture implementation is identical to swipe left, except for the direction point coordinate:

final var direction = new Point (1, 0);

Since the gesture moves towards the right, the finger starts from a particular X-axis coordinate and moves right by increasing the X-axis coordinate value, hence the direction X value is set to 1.
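Since only the direction point differs across the four swipe gestures, the mapping can be centralized in one place. Here is a small sketch of such a helper (the class and method names are hypothetical):

```java
import java.awt.Point;

public class SwipeDirections {
    // The four unit direction points used by the swipe implementations above.
    static Point direction(String name) {
        switch (name) {
            case "up":    return new Point(0, -1);
            case "down":  return new Point(0, 1);
            case "left":  return new Point(-1, 0);
            case "right": return new Point(1, 0);
            default: throw new IllegalArgumentException("Unknown direction: " + name);
        }
    }

    public static void main(String[] args) {
        Point left = direction("left");
        System.out.println(left.x + "," + left.y);
    }
}
```

A single swipe(direction, distance) method built on this avoids duplicating the sequence-building code four times.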
The drag and drop implementation is exactly the same as any swipe gesture. The only difference is that there are two elements, one which will be dragged and the other where the first element will be dropped.
public void dragDrop (final WebElement source, final WebElement target) {
final var sourceCenter = getElementCenter (source);
final var targetCenter = getElementCenter (target);
final var finger = new PointerInput (PointerInput.Kind.TOUCH, "Finger 1");
final var sequence = new Sequence (finger, 0);
sequence.addAction (finger.createPointerMove (Duration.ofMillis (500), PointerInput.Origin.viewport (), sourceCenter.getX (), sourceCenter.getY ()));
sequence.addAction (finger.createPointerDown (PointerInput.MouseButton.LEFT.asArg ()));
sequence.addAction (new Pause (finger, Duration.ofMillis (500)));
sequence.addAction (finger.createPointerMove (Duration.ofMillis (300), PointerInput.Origin.viewport (), targetCenter.getX (), targetCenter.getY ()));
sequence.addAction (finger.createPointerUp (PointerInput.MouseButton.LEFT.asArg ()));
this.driver.perform (Collections.singletonList (sequence));
}

The source element center is the starting coordinate and the target element center is the ending coordinate for the drag and drop gesture.
The zoom in gesture combines a swipe up performed by the index finger with a simultaneous swipe down performed by the thumb. Let's see the implementation for this gesture:
public void zoomIn (final WebElement element, final int distance) {
final var thumbStart = getSwipeStartPosition (element);
final var thumbEnd = getSwipeEndPosition (new Point (0, 1), element, distance);
// Thumb finger: starts at center+5px, swipes downward
final var thumbFinger = new PointerInput (PointerInput.Kind.TOUCH, "Thumb Finger");
final var thumbSequence = buildSequence (thumbFinger, thumbStart.getY () + 5, thumbEnd);
final var indexStart = getSwipeStartPosition (element);
final var indexEnd = getSwipeEndPosition (new Point (0, -1), element, distance);
// Index finger: starts at center-5px, swipes upward
final var indexFinger = new PointerInput (PointerInput.Kind.TOUCH, "Index Finger");
final var indexSequence = buildSequence (indexFinger, indexStart.getY () - 5, indexEnd);
this.driver.perform (Arrays.asList (thumbSequence, indexSequence));
}

Both fingers start 5 pixels apart from the center and move in opposite directions simultaneously. The thumb swipes downward while the index finger swipes upward.
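The buildSequence helper is not shown in the snippet; the coordinate math behind the two fingers can be sketched as follows. Distances are in pixels here for simplicity, while the snippet above uses a percentage, and all names are hypothetical:

```java
import java.awt.Point;

public class ZoomMath {
    // Hypothetical sketch of the zoom-in finger paths described above: both
    // fingers start 5 px from the center and move apart along the Y-axis.
    // Returns { {thumbStart, thumbEnd}, {indexStart, indexEnd} }.
    static Point[][] zoomInPaths(Point center, int distancePx) {
        Point thumbStart = new Point(center.x, center.y + 5);
        Point thumbEnd   = new Point(center.x, center.y + 5 + distancePx); // thumb swipes down
        Point indexStart = new Point(center.x, center.y - 5);
        Point indexEnd   = new Point(center.x, center.y - 5 - distancePx); // index finger swipes up
        return new Point[][] { { thumbStart, thumbEnd }, { indexStart, indexEnd } };
    }

    public static void main(String[] args) {
        Point[][] paths = zoomInPaths(new Point(540, 1200), 300);
        System.out.println(paths[0][1].y + "," + paths[1][1].y);
    }
}
```

Swapping each start with its end point yields the zoom out paths, which is exactly what the zoomOut implementation below does.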
The zoom out gesture implementation is the inverse of zoom in: the index finger moves downwards, increasing its Y-axis coordinate, while the thumb moves upwards, decreasing its Y-axis coordinate.
public void zoomOut (final WebElement element, final int distance) {
// Inverse of zoomIn: start and end positions are swapped
final var thumbEnd = getSwipeStartPosition (element);
final var thumbStart = getSwipeEndPosition (new Point (0, -1), element, distance);
final var thumbFinger = new PointerInput (PointerInput.Kind.TOUCH, "Thumb Finger");
final var thumbSequence = buildSequence (thumbFinger, thumbStart.getY () + 5, thumbEnd);
final var indexEnd = getSwipeStartPosition (element);
final var indexStart = getSwipeEndPosition (new Point (0, 1), element, distance);
final var indexFinger = new PointerInput (PointerInput.Kind.TOUCH, "Index Finger");
final var indexSequence = buildSequence (indexFinger, indexStart.getY () - 5, indexEnd);
this.driver.perform (Arrays.asList (thumbSequence, indexSequence));
}

The zoom out gesture is the inverse of zoom in. The start and end positions are swapped so both fingers move inward toward the center instead of apart.
Apart from the W3C Actions API, gestures can also be performed using platform-specific gesture commands executed through the executeScript method.
Android:
Following are the supported gesture commands for the Android platform using the executeScript method. If you are new to Android automation, refer to this guide on how to automate Android apps using Appium.
Following are the code examples for each Android gesture command:
// Tap
public void tap (final WebElement element) {
final var id = ((RemoteWebElement) element).getId ();
this.driver.executeScript ("mobile: clickGesture", ImmutableMap.of ("elementId", id));
}
// Drag and Drop
public void dragDrop (final WebElement source, final WebElement target) {
final var sourceCenter = getElementCenter (source);
final var targetCenter = getElementCenter (target);
this.driver.executeScript ("mobile: dragGesture", ImmutableMap.of (
"startX", sourceCenter.getX (), "startY", sourceCenter.getY (),
"endX", targetCenter.getX (), "endY", targetCenter.getY ()));
}
// Swipe
public void swipe (final WebElement element, final String direction, final int percentage) {
final var params = ImmutableMap.<String, Object>builder ()
.put ("direction", direction).put ("percent", percentage / 100.0);
if (element != null) {
params.put ("elementId", ((RemoteWebElement) element).getId ());
}
this.driver.executeScript ("mobile: swipeGesture", params.build ());
}

iOS:
Following are the supported gesture commands for the iOS platform. Refer to this guide on how to automate iOS apps using Appium for the full iOS automation workflow.
Following are the code examples for each iOS gesture command:
// Tap
public void tap (final WebElement element) {
final var id = ((RemoteWebElement) element).getId ();
this.driver.executeScript ("mobile: tap", ImmutableMap.of ("elementId", id));
}
// Swipe
public void swipe (final WebElement element, final String direction, final int speed) {
final var params = ImmutableMap.<String, Object>builder ()
.put ("direction", direction).put ("velocity", speed);
if (!isNull (element)) {
params.put ("elementId", ((RemoteWebElement) element).getId ());
}
this.driver.executeScript ("mobile: swipe", params.build ());
}
// Drag and Drop
public void dragDrop (final WebElement source, final WebElement target) {
this.driver.executeScript ("mobile: dragFromToWithVelocity", ImmutableMap.of (
"fromElementId", ((RemoteWebElement) source).getId (),
"toElementId", ((RemoteWebElement) target).getId (),
"pressDuration", 0.5, "holdDuration", 0.5, "velocity", 500));
}
// Zoom In
public void zoomIn (final WebElement element) {
final var params = ImmutableMap.builder ()
.put ("scale", 2.0).put ("velocity", 1.0);
if (!isNull (element)) {
params.put ("elementId", ((RemoteWebElement) element).getId ());
}
this.driver.executeScript ("mobile: pinch", params.build ());
}

The Appium Gesture Plugin is a third-party plugin that supports swipe, drag and drop, double tap, and long press gestures. However, when I tested it, the plugin was not actively maintained and its swipe commands were flaky. It is not recommended for production use.
Use the W3C Actions API or platform-specific mobile commands instead. With Appium 3 introducing improved plugin architecture, gesture support is expected to improve further. Learn more about Appium 3 features.
Sai Krishna, an Appium Contributor and Open Source Contributor with deep expertise in Appium, WebDriver, and framework design patterns, covers the key Appium 3 changes in the video below:
With the UiSelector locator strategy, Appium can automatically scroll to the target element while finding it if the element is not visible in the viewport. This approach is only available on the Android platform. Refer to this guide on locators in Appium for all available locator strategies.
Following is an example of a locator strategy you can use to find the target element and scroll it into the viewport:
private final By scrolledSelectorLogo = AppiumBy.androidUIAutomator (
"new UiScrollable(new UiSelector().scrollable(true)" +
").setAsHorizontalList().setMaxSearchSwipes(5).scrollIntoView(new UiSelector().description(\"WebdriverIO logo\"))");

Let's break down this locator so it becomes easy to understand. First, we find the scrollable container using UiScrollable; this container holds our target element somewhere outside the visible viewport.
We set the scroll direction by calling setAsHorizontalList for horizontal scrolling or setAsVerticalList for vertical scrolling.
I usually set the maximum scroll count by calling setMaxSearchSwipes with the maximum number of swipes as its parameter.
Then, in the scrollIntoView method, we will use UiSelector to find our target element.
When you use this locator, Appium automatically finds the target element inside the scrollable container and scrolls it into the viewport.
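Since the UiAutomator expression is just a string, it can be parameterized and reused. A hypothetical helper for the vertical-list variant might look like this (the class, method, and "Login" description are illustrative; pass the returned string to AppiumBy.androidUIAutomator):

```java
public class ScrollLocator {
    // Hypothetical helper that builds the UiScrollable expression described
    // above, parameterized by the target's content description and a swipe limit.
    static String scrollIntoView(String description, int maxSwipes) {
        return "new UiScrollable(new UiSelector().scrollable(true))"
            + ".setAsVerticalList()"
            + ".setMaxSearchSwipes(" + maxSwipes + ")"
            + ".scrollIntoView(new UiSelector().description(\"" + description + "\"))";
    }

    public static void main(String[] args) {
        System.out.println(scrollIntoView("Login", 5));
    }
}
```

Centralizing the expression this way keeps the swipe limit and scroll direction consistent across tests.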
To test gestures with Appium on real devices, connect your Appium scripts to TestMu AI real device cloud, which provides access to real Android and iOS devices across multiple OS versions.
Gesture behavior varies across device manufacturers, screen sizes, and OS versions. Testing on emulators alone does not ensure that swipe distances, touch coordinates, or pinch gestures will behave identically on real hardware.
Platforms such as TestMu AI (formerly LambdaTest) let you validate gesture accuracy across different screen resolutions and platform versions directly from your Appium scripts, without maintaining a local device lab.
To get started, refer to Appium Java Testing With TestMu AI for the complete setup.
Step 1: Upload your APK (Android) or IPA (iOS) file to the TestMu AI server.
You can find the upload file option on the TestMu AI App Automation dashboard as shown in the screenshot below:

Step 2: Click the upload button and select your application file from the upload modal.

Once you select the target application file for the corresponding platform, you will see the app_url in the box below the upload button.
Step 3: Copy the returned app_url and use it in your capabilities. You can generate these capabilities using the TestMu AI Capabilities Generator.
private Capabilities buildCapabilities () {
final Map<String, Object> ltOptions = new HashMap<> ();
ltOptions.put ("w3c", true);
ltOptions.put ("platformName", "android");
ltOptions.put ("deviceName", "Pixel 9 Pro");
ltOptions.put ("platformVersion", "15");
ltOptions.put ("app", "AndroidApp");
ltOptions.put ("isRealMobile", true);
// ... additional options: visual, network, video, build, name, project
final var options = new UiAutomator2Options ();
options.setCapability ("lt:options", ltOptions);
return options;
}

The app capability value AndroidApp is a custom ID assigned during upload. This ID persists across app version updates, so your test scripts do not need changes when you upload a newer build. If you are using TestNG as your test runner, this Appium with TestNG tutorial walks through the complete integration.
Step 4: Initialize the Appium driver using the capabilities and your TestMu AI credentials:
private static final String ACCESS_KEY = System.getenv ("LT_ACCESS_KEY");
private static final String SERVER_URL = "https://{0}:{1}@mobile-hub.lambdatest.com/wd/hub";
private static final String USERNAME = System.getenv ("LT_USERNAME");
. . .
final var capabilities = buildCapabilities ();
this.driver = new AndroidDriver (new URL (MessageFormat.format (SERVER_URL, USERNAME, ACCESS_KEY)),
capabilities);

Your username and access key are available on the TestMu AI dashboard sidebar. Store them as environment variables instead of hardcoding them in scripts. The session URL follows this format:
https://[username]:[access_key]@mobile-hub.lambdatest.com/wd/hub

Step 5: Run your gesture tests. The TestMu AI App Automation dashboard captures each gesture step along with video recordings, Appium logs, device logs, and network logs for debugging failures on specific devices.

With W3C Actions API support in Appium, you can ensure consistent finger gestures across platforms such as Android and iOS, on both mobile and tablet devices.
Below are tips you can implement in your automation to ensure cross-platform consistency.
While building these reusable gesture utilities, the Appium commands cheat sheet serves as a handy reference for method signatures and supported parameters.
Debugging gesture failures requires visualizing where the finger touches the screen so you can trace gesture coordinates on Android and iOS devices.
If gestures fail even with correct coordinates, the issue may lie in your Appium environment itself. Running Appium Doctor helps verify driver installations and system dependencies before you start debugging gesture logic.
This guide covered the approaches that can be used to perform different finger gestures on Android and iOS devices using Appium. Some approaches are robust, flexible, and stable, while others can be flaky and restrictive.
The W3C Actions API remains the most reliable and platform-independent way to implement gestures. Combined with TestMu AI real device cloud, you can validate gesture behavior across a wide range of Android and iOS devices without maintaining local infrastructure. Once your gesture tests are stable, scale execution by running Appium parallel tests across multiple devices simultaneously.