How to Count Steps Using Python Data Analysis of Acceleration Data

This post shows how to collect acceleration data from your phone with the senDsor app, export that data, and analyze it with Python to determine the number of steps taken.

Install The App

The senDsor app is available for Android and can be downloaded from the Google Play store.

Setup A Sample Rate

The senDsor app collects data from your phone’s built-in sensors. It currently supports user acceleration, general acceleration (including gravity), angular velocity via the gyroscope, and magnetic flux density. More sensors and integrations will be added in upcoming updates.

senDsor lets you configure how often it samples the sensors and how much data it holds in memory before overwriting the oldest samples. Both are set in the Sample Rate menu, as shown in the screenshots below.

We will set the sample interval to 100 milliseconds (the default), which is 10 samples per second, with a buffer length of 60 seconds in order to collect enough data for our analysis.
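
As a quick sanity check on these settings, the expected number of samples per channel can be worked out from the sample interval and the buffer length. The variable names below are illustrative and are not senDsor settings.

```python
# Illustrative names; the values are the ones configured in the Sample Rate menu.
sample_interval_s = 0.100   # 100 ms between samples
buffer_length_s = 60        # the buffer holds 60 seconds of data

sample_rate_hz = 1 / sample_interval_s
samples_per_channel = round(buffer_length_s / sample_interval_s)

print(f"Sample rate: {sample_rate_hz:.0f} Hz")
print(f"Expected samples per channel: about {samples_per_channel}")
```

The export analyzed later in this post has 599 rows per channel, which is consistent with this estimate.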

Collect The Data

After the sample rate and buffer length are configured and saved, the app starts collecting and graphing data from all the available sensors. This example uses only the Accelerometer sensor, so the other plots can be minimized by tapping the small up arrow beside each sensor name.

When you are ready, hold the phone in your hand at your side, or put it in a pocket, and start walking normally.

Export The Data

Once you are satisfied that you have walked enough, you can export the data shown on the graph by tapping the save icon just above the graph. The data will be saved in CSV format so that it can be easily analyzed with other applications.

The CSV file can be saved to the cloud, emailed, or transferred via Bluetooth, WhatsApp, etc. Use your preferred method to get the file to your computer for analysis with Python.

Import The Data to Start the Data Analysis in Python

We will use a few Python data analysis packages to determine the number of steps taken. Specifically, we will use pandas, NumPy and SciPy for data manipulation and analysis, and matplotlib for plotting. Let’s import these packages and read in our CSV data file:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy.signal import find_peaks
import math

df = pd.read_csv('user_accelerometer_data_june29.csv')

print(df)

        Channel                        Time     Value
0     userAccelX  2022-06-29 08:58:59.587010  3.362103
1     userAccelX  2022-06-29 08:58:59.688211 -0.601368
2     userAccelX  2022-06-29 08:58:59.796047 -0.601368
3     userAccelX  2022-06-29 08:58:59.894922  0.649153
4     userAccelX  2022-06-29 08:58:59.987040  0.649153
...          ...                         ...       ...
1792  userAccelZ  2022-06-29 08:59:58.987455  0.294550
1793  userAccelZ  2022-06-29 08:59:59.087619  0.294550
1794  userAccelZ  2022-06-29 08:59:59.209171  0.160627
1795  userAccelZ  2022-06-29 08:59:59.286533  0.160627
1796  userAccelZ  2022-06-29 08:59:59.395452  0.218342

[1797 rows x 3 columns]
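
The Time column is read in as text. A minimal sketch, assuming you want to check the real spacing between samples, is to convert it with pd.to_datetime and look at the differences between consecutive rows. The small dataframe below is a stand-in built from the first five rows printed above so that the snippet runs on its own; with the real file, the same calls would be applied to the rows of a single channel.

```python
import pandas as pd

# Stand-in for the first five userAccelX rows shown above.
sample = pd.DataFrame({
    "Channel": ["userAccelX"] * 5,
    "Time": [
        "2022-06-29 08:58:59.587010",
        "2022-06-29 08:58:59.688211",
        "2022-06-29 08:58:59.796047",
        "2022-06-29 08:58:59.894922",
        "2022-06-29 08:58:59.987040",
    ],
    "Value": [3.362103, -0.601368, -0.601368, 0.649153, 0.649153],
})

# Convert the text timestamps to datetime values.
sample["Time"] = pd.to_datetime(sample["Time"])

# Time between consecutive samples, in seconds (the first entry is NaN).
intervals = sample["Time"].diff().dt.total_seconds()

print(intervals.dropna().round(3).tolist())
print("Mean interval (s):", round(intervals.mean(), 3))
```

The intervals hover around the configured 100 milliseconds rather than matching it exactly, which is worth keeping in mind when converting sample counts to time.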
        

Extract Data For Each Channel

The data has three unique channels in the ‘Channel’ column, so we have to extract each one into its own dataframe for analysis. The channel names can be listed using pandas.

print(df.Channel.unique())

['userAccelX' 'userAccelY' 'userAccelZ']

We then use pandas to extract each channel into its own dataframe so that the channels can be analyzed independently. Note that reset_index() keeps the original row numbers in a new ‘index’ column.

xdata = df[df.Channel == 'userAccelX'].reset_index()
ydata = df[df.Channel == 'userAccelY'].reset_index()
zdata = df[df.Channel == 'userAccelZ'].reset_index()

print(ydata)

    index     Channel                        Time     Value
0      599  userAccelY  2022-06-29 08:58:59.587054 -0.297535
1      600  userAccelY  2022-06-29 08:58:59.688299  0.193463
2      601  userAccelY  2022-06-29 08:58:59.796103  0.193463
3      602  userAccelY  2022-06-29 08:58:59.894994 -0.340354
4      603  userAccelY  2022-06-29 08:58:59.987081 -0.340354
..     ...         ...                         ...       ...
594   1193  userAccelY  2022-06-29 08:59:58.987407  0.377779
595   1194  userAccelY  2022-06-29 08:59:59.087538  0.377779
596   1195  userAccelY  2022-06-29 08:59:59.209120  0.363356
597   1196  userAccelY  2022-06-29 08:59:59.286500  0.363356
598   1197  userAccelY  2022-06-29 08:59:59.395402  0.404495

[599 rows x 4 columns]
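
An alternative to three separate dataframes, sketched here under the assumption that the channels are recorded in the same order and have the same number of rows, is to number the samples within each channel and pivot to one column per axis. The "sample" column name and the toy values are made up for illustration.

```python
import pandas as pd

# Toy long-format data in the same layout as the export (values are made up).
long_df = pd.DataFrame({
    "Channel": ["userAccelX"] * 3 + ["userAccelY"] * 3 + ["userAccelZ"] * 3,
    "Value": [1.0, 2.0, 3.0, 0.1, 0.2, 0.3, -1.0, -2.0, -3.0],
})

# Number the samples within each channel: 0, 1, 2, ...
long_df["sample"] = long_df.groupby("Channel").cumcount()

# One row per sample number, one column per channel.
wide = long_df.pivot(index="sample", columns="Channel", values="Value")

print(wide)
```

The rows are matched by sample number rather than by timestamp because each channel in the export carries slightly different timestamps.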
        

Plot The Raw Accelerometer Data

We now plot the data from the three channels so we can see them visually.

fig, ax = plt.subplots(3,1)
fig.set_figheight(7.5)
fig.set_figwidth(15)

fig.suptitle("Accelerometer Data", fontsize = 30)
fig.tight_layout()

ax[0].plot(xdata.Value, 'b')
ax[0].set_ylabel('x-axis', fontdict = {'size':20})
ax[0].set_ylim(-5,5)
                
ax[1].plot(ydata.Value, 'r')
ax[1].set_ylabel('y-axis', fontdict = {'size':20})
ax[1].set_ylim(-5,5)

ax[2].plot(zdata.Value, 'g')
ax[2].set_ylabel('z-axis', fontdict = {'size':20})
ax[2].set_ylim(-5,5)


plt.show()
        

Process The Raw Accelerometer Data

Next, we combine the data from the three axes into a single scalar magnitude for each sample: the square root of the sum of the squared x, y and z values. This ensures that large accelerations, such as taking a step, can be detected regardless of the orientation of the phone.

accel_mag = np.sqrt(xdata.Value**2 + ydata.Value**2 + zdata.Value**2)

print(accel_mag)

0      3.725480
1      0.825860
2      0.825860
3      2.315852
4      2.315852
         ...   
594    1.567787
595    1.567787
596    1.363588
597    1.363588
598    1.407883
Name: Value, Length: 599, dtype: float64
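
To illustrate why the magnitude does not depend on how the phone is held, here is a small sketch with a made-up acceleration vector. Rotating the vector, which is what turning the phone does to the axis readings, changes the x, y and z values but leaves the magnitude the same.

```python
import numpy as np

# Made-up acceleration reading in the phone's frame (m/s^2).
accel = np.array([3.0, 0.0, 4.0])

# Rotation of 90 degrees about the z-axis, as if the phone were turned in the hand.
theta = np.pi / 2
rot_z = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
rotated = rot_z @ accel

print("Original axes:", accel, "magnitude:", np.linalg.norm(accel))
print("Rotated axes: ", rotated.round(3), "magnitude:", round(float(np.linalg.norm(rotated)), 3))
```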
        

Plot the Combined Acceleration Magnitude

fig, ax = plt.subplots(1,1)
fig.set_figheight(7.5)
fig.set_figwidth(15)

fig.suptitle("Accelerometer Data", fontsize = 30)
fig.tight_layout()

ax.plot(accel_mag, 'b')
ax.set_ylabel('Acceleration (m/s^2)', fontdict = {'size':20})
ax.set_ylim(0,7)

plt.show()
        

Count The Number Of Steps From The Combined Acceleration Data

The peaks in the combined acceleration data generally represent steps, so the number of peaks is approximately equal to the number of steps. Which peaks count as a full step can be fine-tuned by setting a minimum height in the find_peaks function. A height of 2 m/s^2 was chosen for this example after visual inspection of the graph.

SciPy is used to determine the indices of the peaks as follows:

peaks, _ = find_peaks(accel_mag, height=2)
print(peaks)

[  5   9  13  19  23  27  34  42  46  57  71  75  85 101 105 115 119 131
 145 160 177 187 191 195 199 203 207 215 219 225 232 238 246 250 263 268
 277 280 295 303 307 313 321 327 335 341 349 353 364 370 376 380 384 390
 396 400 405 408 413 418 426 431 445 449 459 474 488 492 504 514 518 526
 533 543 547 552 562 567 580 590]
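
Some of the indices above are only a few samples apart (for example 405 and 408, about 0.3 seconds at this sample interval), which may be a single step producing two peaks. One possible refinement, not part of the analysis in this post, is the distance argument of find_peaks, which enforces a minimum number of samples between peaks. The synthetic signal and the value distance=3 below are assumptions chosen only for illustration.

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic magnitude signal (made up): two of the "steps" produce double peaks.
signal = np.array([0, 3, 1, 2.5, 0, 0, 0, 3, 0, 0, 0, 2.8, 1.5, 3.2, 0, 0])

# Height threshold only, as in the analysis above.
all_peaks, _ = find_peaks(signal, height=2)

# Same threshold, but peaks must be at least 3 samples apart.
# When two peaks are too close, the lower one is dropped.
spaced_peaks, _ = find_peaks(signal, height=2, distance=3)

print("height only:      ", all_peaks)
print("height + distance:", spaced_peaks)
```

A suitable distance for real data depends on the sample rate and on how fast the person walks, so it would need the same kind of visual check as the height threshold.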
        

We can plot the peaks with the combined acceleration data to see exactly which peaks were found.

fig, ax = plt.subplots(1,1)
fig.set_figheight(7.5)
fig.set_figwidth(15)

fig.suptitle("Accelerometer Data with Peaks", fontsize = 30)
fig.tight_layout()

ax.plot(accel_mag, 'b')
ax.plot(peaks, accel_mag.iloc[peaks], "rx")
ax.set_ylabel('Acceleration (m/s^2)', fontdict = {'size':20})
ax.set_ylim(0,7)

plt.tight_layout()
plt.savefig("accelwithpeaks.png", format="png")
plt.show()
        

We can now find the number of steps by simply finding the length of the peaks array.

print("The number of steps taken is", len(peaks))

The number of steps taken is 80
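
With the step count and the first and last timestamps of the recording, a walking cadence can also be estimated. The sketch below uses the first and last userAccelY timestamps printed earlier and the 80 peaks found above; the variable names are illustrative.

```python
import pandas as pd

steps = 80  # len(peaks) from the analysis above

# First and last userAccelY timestamps from the printed dataframe.
start = pd.Timestamp("2022-06-29 08:58:59.587054")
end = pd.Timestamp("2022-06-29 08:59:59.395402")

duration_s = (end - start).total_seconds()
cadence_spm = steps / duration_s * 60  # steps per minute

print(f"Duration: {duration_s:.1f} s")
print(f"Cadence: {cadence_spm:.1f} steps per minute")
```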
        

That’s it! That’s how to count the number of steps taken using accelerometer data gathered from a smartphone with the senDsor app and analyzed with Python.

Don’t forget to share this post with friends who may find it useful.