MediaPipe is a framework for building cross-platform, multimodal applied-ML pipelines, combining fast ML inference, classical computer vision, and media processing (e.g., video decoding). Below is an example MediaPipe graph for object detection and tracking. It consists of four nodes: a PacketResampler calculator; a previously published ObjectDetection subgraph; an ObjectTracking subgraph that wraps the BoxTracking subgraph; and a Renderer subgraph that draws the visualization.

The ObjectDetection subgraph runs only on request, e.g., at an arbitrary frame rate or when triggered by a specific signal. More specifically, in this example the PacketResampler temporally subsamples the video frames to 0.5 fps before passing them to ObjectDetection; you can configure PacketResampler to use a different frame rate. Running detection only occasionally and tracking objects in between reduces temporal jitter and keeps object IDs consistent across frames.
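For reference, this kind of temporal subsampling is expressed as a node in the MediaPipe graph config. Below is a sketch of what such a node might look like; the stream names are illustrative, and the syntax assumes the PacketResamplerCalculator options proto shipped with MediaPipe:
node {
  # Temporally subsample the incoming frames to 0.5 fps.
  calculator: "PacketResamplerCalculator"
  input_stream: "DATA:input_video"
  output_stream: "DATA:downsampled_input_video"
  node_options: {
    [type.googleapis.com/mediapipe.PacketResamplerCalculatorOptions] {
      frame_rate: 0.5
    }
  }
}
Raising frame_rate makes detection run more often, at the cost of extra computation.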
MediaPipe is open source at: https://github.com/google/mediapipe
Step 1: Install the MediaPipe Framework
Install the build dependencies:
sudo apt-get update && sudo apt-get install -y build-essential git python zip adb openjdk-8-jdk
Install Bazel, which MediaPipe uses as its build system:
curl -sLO --retry 5 --retry-max-time 10 \
https://storage.googleapis.com/bazel/2.0.0/release/bazel-2.0.0-installer-linux-x86_64.sh && \
sudo mkdir -p /usr/local/bazel/2.0.0 && \
chmod 755 bazel-2.0.0-installer-linux-x86_64.sh && \
sudo ./bazel-2.0.0-installer-linux-x86_64.sh --prefix=/usr/local/bazel/2.0.0 && \
source /usr/local/bazel/2.0.0/lib/bazel/bin/bazel-complete.bash
/usr/local/bazel/2.0.0/lib/bazel/bin/bazel version && \
alias bazel='/usr/local/bazel/2.0.0/lib/bazel/bin/bazel'
Install the adb tool. If you develop from Windows, install the same adb version there; download it from: https://dl.google.com/android/repository/platform-tools_r26.0.1-windows.zip
sudo apt-get install android-tools-adb
adb version
# Android Debug Bridge version 1.0.39
Clone the MediaPipe source code.
git clone https://github.com/google/mediapipe.git
cd mediapipe
Install the OpenCV development packages with the following command:
sudo apt-get install libopencv-core-dev libopencv-highgui-dev \
libopencv-calib3d-dev libopencv-features2d-dev \
libopencv-imgproc-dev libopencv-video-dev
Execute the following command to verify that the environment is set up correctly:
export GLOG_logtostderr=1
bazel run --define MEDIAPIPE_DISABLE_GPU=1 \
mediapipe/examples/desktop/hello_world:hello_world
If the environment was set up successfully, output like the following is printed:
I20200707 09:21:50.275205 16138 hello_world.cc:56] Hello World!
I20200707 09:21:50.276554 16138 hello_world.cc:56] Hello World!
I20200707 09:21:50.276665 16138 hello_world.cc:56] Hello World!
I20200707 09:21:50.276768 16138 hello_world.cc:56] Hello World!
I20200707 09:21:50.276887 16138 hello_world.cc:56] Hello World!
I20200707 09:21:50.277523 16138 hello_world.cc:56] Hello World!
I20200707 09:21:50.278563 16138 hello_world.cc:56] Hello World!
I20200707 09:21:50.279263 16138 hello_world.cc:56] Hello World!
I20200707 09:21:50.279850 16138 hello_world.cc:56] Hello World!
I20200707 09:21:50.280354 16138 hello_world.cc:56] Hello World!
Step 2: Compile the MediaPipe Android AAR Package
Execute the following script from the root directory of mediapipe to install the Android SDK and NDK. During installation you need to accept the license agreement by entering y. After the script finishes, verify that the SDK and NDK were downloaded to the specified directories.
chmod +x ./setup_android_sdk_and_ndk.sh
bash ./setup_android_sdk_and_ndk.sh ~/Android/Sdk ~/Android/Ndk r18b
If you encounter the $'\r': command not found error (the script has Windows CRLF line endings, which typically happens when the repository was cloned on Windows), convert it to Unix line endings:
vim setup_android_sdk_and_ndk.sh
:set ff=unix
:wq
Add the SDK and NDK environment variables. Given the paths passed to the script above, the settings are as follows:
vim ~/.bashrc
Add the following lines (replace test with your actual username):
export ANDROID_HOME=/home/test/Android/Sdk
export ANDROID_NDK_HOME=/home/test/Android/Ndk/android-ndk-r18b
Execute source ~/.bashrc to apply the changes, then confirm with echo $ANDROID_HOME.
Create a build file for MediaPipe to generate an Android AAR:
cd mediapipe/examples/android/src/java/com/google/mediapipe/apps/
mkdir build_aar && cd build_aar
vim BUILD
The content of the BUILD file is as follows:
load("//mediapipe/java/com/google/mediapipe:mediapipe_aar.bzl", "mediapipe_aar")
mediapipe_aar(
name = "mediapipe_hand_tracking",
calculators = ["//mediapipe/graphs/hand_tracking:mobile_calculators"],
)
- name: the name of the generated AAR.
- calculators: the models and calculators to compile in. Other available models and calculators can be found in the mediapipe/graphs/ directory; the hand_tracking directory contains the hand-tracking model. For the calculators, check the cc_library rules in the BUILD file of the target directory; for Android deployment, select the mobile calculators. An example for a different graph is sketched below.
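For example, to package a different solution, point calculators at that graph's mobile target. A sketch for face detection, assuming your checkout provides the //mediapipe/graphs/face_detection:mobile_calculators target:
load("//mediapipe/java/com/google/mediapipe:mediapipe_aar.bzl", "mediapipe_aar")

mediapipe_aar(
    name = "mediapipe_face_detection",
    calculators = ["//mediapipe/graphs/face_detection:mobile_calculators"],
)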
Return to the mediapipe root directory and execute the following command to generate the Android AAR file:
chmod -R 755 mediapipe/
bazel build -c opt --fat_apk_cpu=arm64-v8a,armeabi-v7a \
//mediapipe/examples/android/src/java/com/google/mediapipe/apps/build_aar:mediapipe_hand_tracking
The generated AAR file will be located at:
bazel-bin/mediapipe/examples/android/src/java/com/google/mediapipe/apps/build_aar/mediapipe_hand_tracking.aar
Generate the MediaPipe binary graph with the following command (replace the binary graph name as needed):
bazel build -c opt mediapipe/graphs/hand_tracking:hand_tracking_mobile_gpu_binary_graph
The generated binary graph file will be located at:
bazel-bin/mediapipe/graphs/hand_tracking/hand_tracking_mobile_gpu.binarypb
Step 3: Build the Android Project
- Create a new “TestMediaPipe” project in Android Studio.
- Copy the AAR generated in the previous step to the app/libs/ directory:
bazel-bin/mediapipe/examples/android/src/java/com/google/mediapipe/apps/build_aar/mediapipe_hand_tracking.aar
- Copy the following files to the app/src/main/assets/ directory:
bazel-bin/mediapipe/graphs/hand_tracking/hand_tracking_mobile_gpu.binarypb
mediapipe/models/handedness.txt
mediapipe/models/hand_landmark.tflite
mediapipe/models/palm_detection.tflite
mediapipe/models/palm_detection_labelmap.txt
- Download the OpenCV SDK from:
https://github.com/opencv/opencv/releases/download/3.4.3/opencv-3.4.3-android-sdk.zip
After unzipping, copy the arm64-v8a and armeabi-v7a directories from OpenCV-android-sdk/sdk/native/libs/ to the app/src/main/jniLibs/ directory of the Android project.
- Add the following dependencies in app/build.gradle:
dependencies {
implementation fileTree(dir: "libs", include: ["*.jar", '*.aar'])
implementation 'androidx.appcompat:appcompat:1.1.0'
implementation 'androidx.constraintlayout:constraintlayout:1.1.3'
testImplementation 'junit:junit:4.13'
androidTestImplementation 'androidx.test.ext:junit:1.1.1'
androidTestImplementation 'androidx.test.espresso:espresso-core:3.2.0'
// MediaPipe dependencies
implementation 'com.google.flogger:flogger:0.3.1'
implementation 'com.google.flogger:flogger-system-backend:0.3.1'
implementation 'com.google.code.findbugs:jsr305:3.0.2'
implementation 'com.google.guava:guava:27.0.1-android'
implementation 'com.google.protobuf:protobuf-java:3.11.4'
// CameraX core library
implementation "androidx.camera:camera-core:1.0.0-alpha06"
implementation "androidx.camera:camera-camera2:1.0.0-alpha06"
}
// Set the Java version to 1.8 (this block goes inside android { })
compileOptions {
targetCompatibility = 1.8
sourceCompatibility = 1.8
}
- Add camera permissions in AndroidManifest.xml:
<!-- For camera access -->
<uses-permission android:name="android.permission.CAMERA" />
<uses-feature android:name="android.hardware.camera" />
<uses-feature android:name="android.hardware.camera.autofocus" />
<!-- For MediaPipe -->
<uses-feature android:glEsVersion="0x00020000" android:required="true" />
- Modify the layout and logic code in activity_main.xml and MainActivity.java:
activity_main.xml:
<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:app="http://schemas.android.com/apk/res-auto"
xmlns:tools="http://schemas.android.com/tools"
android:layout_width="match_parent"
android:layout_height="match_parent">
<FrameLayout
android:id="@+id/preview_display_layout"
android:layout_width="match_parent"
android:layout_height="match_parent">
<TextView
android:id="@+id/no_camera_access_view"
android:layout_width="match_parent"
android:layout_height="match_parent"
android:gravity="center"
android:text="Camera connection failed" />
</FrameLayout>
</LinearLayout>
MainActivity.java:
```java
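// MainActivity wires the CameraX preview into the MediaPipe hand-tracking
// graph: camera frames are fed to the "input_video" stream, annotated frames
// come back on "output_video", and the presence/landmark streams are read
// through packet callbacks.
//
// The import list below is an assumption based on the package layout of the
// MediaPipe AAR and AndroidX at the time of writing; adjust as needed.
import android.graphics.SurfaceTexture;
import android.os.Bundle;
import android.util.Log;
import android.util.Size;
import android.view.SurfaceHolder;
import android.view.SurfaceView;
import android.view.View;
import android.view.ViewGroup;
import androidx.annotation.NonNull;
import androidx.appcompat.app.AppCompatActivity;
import com.google.mediapipe.components.CameraXPreviewHelper;
import com.google.mediapipe.components.ExternalTextureConverter;
import com.google.mediapipe.components.FrameProcessor;
import com.google.mediapipe.components.PermissionHelper;
import com.google.mediapipe.formats.proto.LandmarkProto.NormalizedLandmarkList;
import com.google.mediapipe.framework.AndroidAssetUtil;
import com.google.mediapipe.framework.PacketGetter;
import com.google.mediapipe.glutil.EglManager;
import com.google.protobuf.InvalidProtocolBufferException;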
public class MainActivity extends AppCompatActivity {
private static final String TAG = "MainActivity";
// Resource names and stream outputs
private static final String BINARY_GRAPH_NAME = "hand_tracking_mobile_gpu.binarypb";
private static final String INPUT_VIDEO_STREAM_NAME = "input_video";
private static final String OUTPUT_VIDEO_STREAM_NAME = "output_video";
private static final String OUTPUT_HAND_PRESENCE_STREAM_NAME = "hand_presence";
private static final String OUTPUT_LANDMARKS_STREAM_NAME = "hand_landmarks";
private SurfaceTexture previewFrameTexture;
private SurfaceView previewDisplayView;
private EglManager eglManager;
private FrameProcessor processor;
private ExternalTextureConverter converter;
private CameraXPreviewHelper cameraHelper;
private boolean handPresence;
private static final boolean USE_FRONT_CAMERA = false;
private static final boolean FLIP_FRAMES_VERTICALLY = true;
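// Load the MediaPipe JNI bindings and the OpenCV native library that was
// copied into app/src/main/jniLibs/.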
static {
System.loadLibrary("mediapipe_jni");
System.loadLibrary("opencv_java3");
}
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
previewDisplayView = new SurfaceView(this);
setupPreviewDisplayView();
PermissionHelper.checkAndRequestCameraPermissions(this);
AndroidAssetUtil.initializeNativeAssetManager(this);
eglManager = new EglManager(null);
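// The FrameProcessor loads the binary graph from the assets directory and
// connects the named input and output video streams.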
processor = new FrameProcessor(this,
eglManager.getNativeContext(),
BINARY_GRAPH_NAME,
INPUT_VIDEO_STREAM_NAME,
OUTPUT_VIDEO_STREAM_NAME);
processor.getVideoSurfaceOutput().setFlipY(FLIP_FRAMES_VERTICALLY);
processor.addPacketCallback(OUTPUT_HAND_PRESENCE_STREAM_NAME,
(packet) -> {
handPresence = PacketGetter.getBool(packet);
Log.d(TAG, "[TS:" + packet.getTimestamp() + "] Hand presence: " + handPresence);
});
processor.addPacketCallback(OUTPUT_LANDMARKS_STREAM_NAME,
(packet) -> {
try {
NormalizedLandmarkList landmarks = NormalizedLandmarkList.parseFrom(PacketGetter.getProtoBytes(packet));
if (landmarks != null && handPresence) {
Log.d(TAG, "[TS:" + packet.getTimestamp() + "] #Landmarks: " + landmarks.getLandmarkCount());
Log.d(TAG, getLandmarksDebugString(landmarks));
}
} catch (InvalidProtocolBufferException e) {
Log.e(TAG, "Error parsing landmarks: " + e);
}
});
}
@Override
protected void onResume() {
super.onResume();
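// The converter copies frames from the camera's SurfaceTexture into GL
// textures the graph can consume; it is recreated on every resume.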
converter = new ExternalTextureConverter(eglManager.getContext());
converter.setFlipY(FLIP_FRAMES_VERTICALLY);
converter.setConsumer(processor);
if (PermissionHelper.cameraPermissionsGranted(this)) {
startCamera();
}
}
@Override
protected void onPause() {
super.onPause();
converter.close();
}
@Override
public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
super.onRequestPermissionsResult(requestCode, permissions, grantResults);
PermissionHelper.onRequestPermissionsResult(requestCode, permissions, grantResults);
}
protected void onPreviewDisplaySurfaceChanged(SurfaceHolder holder, int format, int width, int height) {
Size viewSize = computeViewSize(width, height);
Size displaySize = cameraHelper.computeDisplaySizeFromViewSize(viewSize);
boolean isCameraRotated = cameraHelper.isCameraRotated();
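// When the camera image is rotated relative to the display, swap width and
// height so the converter receives the upright dimensions.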
converter.setSurfaceTextureAndAttachToGLContext(
previewFrameTexture,
isCameraRotated ? displaySize.getHeight() : displaySize.getWidth(),
isCameraRotated ? displaySize.getWidth() : displaySize.getHeight());
}
private void setupPreviewDisplayView() {
previewDisplayView.setVisibility(View.GONE);
ViewGroup viewGroup = findViewById(R.id.preview_display_layout);
viewGroup.addView(previewDisplayView);
previewDisplayView.getHolder().addCallback(new SurfaceHolder.Callback() {
@Override
public void surfaceCreated(SurfaceHolder holder) {
processor.getVideoSurfaceOutput().setSurface(holder.getSurface());
}
@Override
public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
onPreviewDisplaySurfaceChanged(holder, format, width, height);
}
@Override
public void surfaceDestroyed(SurfaceHolder holder) {
processor.getVideoSurfaceOutput().setSurface(null);
}
});
}
protected void onCameraStarted(SurfaceTexture surfaceTexture) {
previewFrameTexture = surfaceTexture;
previewDisplayView.setVisibility(View.VISIBLE);
}
protected Size cameraTargetResolution() {
return null; // No preference; let the camera helper pick a resolution.