
Passive Acoustic Footstep Classification in Tunnel Environments

[Figure: Acoustic waveform visualization of footstep signatures in a concrete tunnel]

A human walking through a 1.8-meter concrete tunnel at normal pace — approximately 1.2 m/s — produces a seismic-acoustic signature that is, in principle, identifiable. The footstep contact impulse generates a broadband ground vibration signal peaking in the 10–200 Hz range, with the exact spectral content depending on gait, footwear, floor surface, and the acoustic properties of the tunnel structure. The signal is detectable. The harder problem is classification: distinguishing that signal from the ambient noise environment of an operational tunnel, which may include dripping or running water, rock micro-fracture events (popping sounds from stress redistribution), mechanical ventilation fans, and, in active mining contexts, heavy equipment vibration that can exceed the footstep signal by 30–40 dB.

This article covers what the technical literature reports about footstep detection accuracy, where those reports diverge from operational reality, and what the feature engineering and classifier choices look like in practice.

The Seismic Footstep Signal: What You're Actually Measuring

A footstep generates two coupled signals: an airborne acoustic pressure wave and a ground-coupled seismic wave. In tunnel environments, both propagate simultaneously, but with very different characteristics. The airborne acoustic signal travels at roughly 340 m/s, attenuates at approximately 6 dB per doubling of distance in free field, and undergoes strong multipath reflection in concrete corridors, which can produce standing-wave peaks and nulls in the received spectrum. The seismic ground wave travels at 200–600 m/s in concrete (depending on aggregate composition and cure state), attenuates more slowly with distance (the amplitude of a Rayleigh surface wave falls off as 1/√r under geometric spreading, versus 1/r for spherical spreading), and is less affected by airborne multipath.
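The two spreading laws above can be compared with a few lines of arithmetic. The following is a minimal sketch, not a propagation model: it accounts for geometric spreading only and ignores material damping, reverberation, and coupling losses. The function name `spreading_loss_db` and the example distances are illustrative assumptions.

```python
import math

def spreading_loss_db(r, r_ref=1.0, exponent=1.0):
    """Geometric spreading loss in dB at range r, relative to r_ref.

    Assumes amplitude proportional to 1 / r**exponent:
    exponent=1.0 for spherical spreading, 0.5 for a surface (Rayleigh) wave.
    """
    return 20.0 * exponent * math.log10(r / r_ref)

# Loss for each doubling of distance under the two laws.
airborne_per_doubling = spreading_loss_db(2.0, exponent=1.0)
rayleigh_per_doubling = spreading_loss_db(2.0, exponent=0.5)
print(f"airborne: {airborne_per_doubling:.2f} dB per doubling")
print(f"Rayleigh: {rayleigh_per_doubling:.2f} dB per doubling")

# Illustrative ranges in meters (assumed values, not from field data).
for r in (5.0, 10.0, 20.0, 40.0):
    air = spreading_loss_db(r, exponent=1.0)
    ground = spreading_loss_db(r, exponent=0.5)
    print(f"r = {r:4.0f} m: airborne {air:5.1f} dB, Rayleigh {ground:5.1f} dB")
```

Under these assumptions the surface wave loses half as many decibels as the airborne wave over the same range, which is one reason the ground-coupled channel stays usable at longer sensor spacings.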

For detection and localization purposes, geophones or accelerometers mounted in contact with the floor or wall are generally more reliable than microphones in high-noise-floor environments, because the ground-coupled signal competes against a different (usually lower-energy) noise floor than the airborne channel. The tradeoff is bandwidth: geophones are sensitive in the 1–200 Hz range but have limited response above that; MEMS accelerometers extend to several kHz but have higher noise floors at low frequency. The relevant footstep energy is concentrated below 200 Hz, which makes geophones a natural choice for this application, despite their bulkier form factor relative to microphones.

Feature Extraction: MFCC and Spectral Approaches

The signal processing pipeline for footstep classification follows a general acoustic event recognition architecture: preprocessing (filtering, normalization), feature extraction, and classification. The dominant feature extraction approaches in published literature for footstep and footfall detection are:

Classifier Choices and Published Accuracy

Published accuracy figures for footstep classification in controlled laboratory settings are generally high — reported detection rates of 85–95% with false positive rates below 5% appear in academic literature covering seismic footstep detection in building floors, soil surfaces, and simulated tunnel geometries. Support vector machines (SVMs) with MFCC features and random forest classifiers with multi-feature inputs are the most commonly reported approaches in older literature. More recent work uses convolutional neural networks (CNNs) on spectrogram features, with reported improvements of 5–10 percentage points over SVM baselines in controlled conditions.

These numbers deserve careful interpretation. Controlled laboratory conditions for footstep detection research typically involve quiet environments, a single known surface type, and footstep signals recorded at known short distances. Tunnel environments introduce several confounds that controlled studies rarely capture:

We want to be clear about the limits of what we can claim here: our internal field characterization data covers a limited set of surface types and tunnel geometries, and generalizing from that to universal performance figures would be misleading. The honest statement is that footstep classification in clean controlled conditions is a solved problem; footstep classification in operational tunnels with complex noise environments is still a work in progress across the field, not just for us.

The Multi-Source Noise Problem

The three primary interference sources in tunnel seismic-acoustic environments deserve individual treatment because they interfere in different frequency bands and require different mitigation approaches.

Water flow: dripping water produces discrete impulse events with broadband spectral content that can masquerade as footsteps at high event rates. Running water (streamflow in drainage channels) produces continuous broadband noise from 10 Hz to several kHz with spectral characteristics that shift with flow rate. Adaptive noise cancellation using a reference sensor in the drainage channel can reduce this interference by 15–20 dB in the overlapping band, at the cost of an additional sensor and the complexity of designing an effective cancellation filter.
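One common way to build a reference-sensor canceller of this kind is a normalized LMS (NLMS) adaptive filter. The sketch below is a pure-Python illustration on synthetic data, not a description of any fielded system: the function `nlms_cancel`, the tap count, the step size `mu`, and the two-tap coupling path are all assumptions chosen for the demo. The reduction it prints describes this synthetic case only and says nothing about real tunnels.

```python
import math
import random

def nlms_cancel(primary, reference, n_taps=8, mu=0.5, eps=1e-8):
    """Remove the part of `primary` that is linearly predictable from `reference`.

    Returns the error signal, which is the cleaned floor-sensor output.
    """
    weights = [0.0] * n_taps
    history = [0.0] * n_taps  # recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned

def mean_power(x):
    return sum(v * v for v in x) / len(x)

# Synthetic demo: white "water noise" at the reference sensor reaches the
# floor sensor through a hypothetical two-tap path, plus small sensor noise.
random.seed(0)
n = 4000
reference = [random.gauss(0.0, 1.0) for _ in range(n)]
interference = [
    0.8 * reference[i] + (0.3 * reference[i - 1] if i > 0 else 0.0)
    for i in range(n)
]
primary = [v + random.gauss(0.0, 0.05) for v in interference]

cleaned = nlms_cancel(primary, reference)

tail = 1000  # evaluate after the filter has converged
reduction_db = 10.0 * math.log10(
    mean_power(primary[-tail:]) / mean_power(cleaned[-tail:])
)
print(f"interference reduction on synthetic data: {reduction_db:.1f} dB")
```

A footstep impulse that does not appear at the reference sensor is not predictable from it, so it passes through to the cleaned output. If the reference sensor also picks up footsteps, the filter will partly cancel them, which is the main design risk in placing that sensor.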

Rock settlement and micro-fracture: in active tunnels or those under geologically active overburden, stress redistribution produces acoustic emissions that range from single sharp clicks (micro-fracture events) to rolling low-frequency rumbles (large-scale settlement). Micro-fracture events are particularly problematic because their impulse shape is similar to a footstep heel-strike at short distances. Distinguishing them from footsteps requires either multi-node TDOA localization (to check whether the apparent source location moves coherently as a walking person would) or waveform features that exploit the different decay characteristics of mechanical fracture versus floor-coupled footstep impacts.
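As one illustration of a decay-based waveform feature, the sketch below estimates an exponential decay time constant from the log of the frame RMS envelope. It is an assumed example feature, not one taken from a specific published classifier, and it makes no claim about which event type decays faster. The names `decay_time_constant` and `damped_burst`, the frame length, and the test waveform are hypothetical.

```python
import math

def decay_time_constant(x, fs, frame=20):
    """Estimate tau (seconds) assuming the envelope decays as exp(-t / tau).

    Fits a least-squares line to ln(frame RMS) versus frame start time
    and returns -1 / slope.
    """
    times, log_rms = [], []
    for start in range(0, len(x) - frame + 1, frame):
        segment = x[start:start + frame]
        rms = math.sqrt(sum(v * v for v in segment) / frame)
        if rms > 0.0:
            times.append(start / fs)
            log_rms.append(math.log(rms))
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(log_rms) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, log_rms))
    return -sxx / sxy  # equals -1 / slope

def damped_burst(tau, fs=1000, f0=100.0, n_samples=300):
    """Synthetic exponentially damped sinusoid used only to exercise the fit."""
    return [
        math.exp(-i / (fs * tau)) * math.sin(2.0 * math.pi * f0 * i / fs)
        for i in range(n_samples)
    ]

for true_tau in (0.02, 0.08):
    estimate = decay_time_constant(damped_burst(true_tau), fs=1000)
    print(f"true tau {true_tau:.3f} s, estimated {estimate:.3f} s")
```

On real recordings the fit window would need to stop before the event sinks into the noise floor, otherwise the flat tail biases the estimate toward longer decay times.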

Mechanical ventilation: ventilation fans in tunnels typically have a fundamental frequency of 10–60 Hz, with harmonics extending to several hundred Hz. Fan vibration couples into the tunnel structure and produces persistent periodic signals that can saturate STFT spectrogram features in the bands most relevant for footstep detection. Notch filtering at the known fan frequencies is the straightforward mitigation, but because the fan frequency changes with load, it has to be monitored continuously, which complicates the signal chain.
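A fixed-frequency notch can be sketched as a second-order IIR section using the widely published audio-EQ "cookbook" biquad coefficients. This is a minimal pure-Python illustration: the 50 Hz fan frequency, the 120 Hz test tone, the sample rate, and the Q value are assumptions for the demo, and the sketch does not track a fan frequency that drifts with load.

```python
import math

def notch_filter(x, fs, f0, q=10.0):
    """Second-order notch at f0 (Hz) applied to samples x at rate fs (Hz)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b0, b1, b2 = 1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0
    a1, a2 = -2.0 * cos_w0 / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        out.append(yn)
    return out

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

fs = 1000          # assumed sample rate in Hz
n = 2000
fan = [math.sin(2.0 * math.pi * 50.0 * i / fs) for i in range(n)]    # fan line
tone = [math.sin(2.0 * math.pi * 120.0 * i / fs) for i in range(n)]  # in-band test tone

fan_out = notch_filter(fan, fs, f0=50.0)
tone_out = notch_filter(tone, fs, f0=50.0)

# Compare levels after the start-up transient has died away.
print(f"50 Hz in/out RMS:  {rms(fan[-500:]):.4f} / {rms(fan_out[-500:]):.6f}")
print(f"120 Hz in/out RMS: {rms(tone[-500:]):.4f} / {rms(tone_out[-500:]):.4f}")
```

A higher Q narrows the notch and preserves more neighboring footstep energy, but it also lengthens the filter's ringing and makes it less tolerant of fan speed drift. Each harmonic that matters would need its own section.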

TDOA Localization and Classification Synergy

The most operationally useful configuration for passive acoustic footstep detection in tunnel corridors is not a single-node classifier but a multi-node array with TDOA-based localization feeding into event classification. Here is why: many of the ambiguous events, such as rock settlement and water flow transients, originate from a fixed location. A walking person produces a sequence of events whose source position advances along the tunnel at 1–1.5 m/s. A TDOA array with three or more nodes spaced 10–20 meters apart can compute the apparent source position for each detected event. If successive events show spatially coherent motion at human walking speed, the classification confidence goes up substantially even if the individual event signatures are ambiguous.
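The localization-then-coherence check can be sketched in one dimension. The example below is a simplified illustration under stated assumptions: it uses a single pair of nodes with the source between them, whereas a real array would use three or more nodes to resolve sources outside a pair. The wave speed is a placeholder that would need to be measured on site, and the function names, the residual threshold, and the simulated step times are hypothetical.

```python
import math

def tdoa_position(x_a, x_b, t_a, t_b, wave_speed):
    """Source position along the tunnel from arrival times at two nodes.

    Assumes x_a < source < x_b and a single known propagation speed.
    """
    return 0.5 * (x_a + x_b + wave_speed * (t_a - t_b))

def fit_track(times, positions):
    """Least-squares line through (time, position); returns (speed, rms_residual)."""
    n = len(times)
    t_mean = sum(times) / n
    p_mean = sum(positions) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (p - p_mean) for t, p in zip(times, positions))
    speed = sxy / sxx
    intercept = p_mean - speed * t_mean
    sq = sum((p - (intercept + speed * t)) ** 2 for t, p in zip(times, positions))
    return speed, math.sqrt(sq / n)

def looks_like_walker(times, positions, v_min=1.0, v_max=1.5, max_resid=0.5):
    """True if the event positions move coherently at walking speed (m/s)."""
    if len(times) < 3:
        return False
    speed, resid = fit_track(times, positions)
    return v_min <= abs(speed) <= v_max and resid <= max_resid

# Simulated demo (all values assumed): nodes 15 m apart, 400 m/s wave speed.
x_a, x_b, c = 0.0, 15.0, 400.0
step_times = [0.5 * k for k in range(8)]            # one event every 0.5 s
walker_true = [3.0 + 1.2 * t for t in step_times]   # walking at 1.2 m/s

# The demo fits against true event times; a deployment would use a node's
# arrival timestamps instead.
walker_est = [
    tdoa_position(x_a, x_b, t + (x - x_a) / c, t + (x_b - x) / c, c)
    for t, x in zip(step_times, walker_true)
]
drip_est = [
    tdoa_position(x_a, x_b, t + (7.0 - x_a) / c, t + (x_b - 7.0) / c, c)
    for t in step_times
]

print("walker:", looks_like_walker(step_times, walker_est))
print("stationary drip:", looks_like_walker(step_times, drip_est))
```

With a 15 m baseline and this wave speed, the largest possible arrival-time difference is under 40 ms, so timing errors of even a few milliseconds translate into position errors on the order of a meter. That is the link between this check and inter-node clock accuracy.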

This architecture, in which localization and classification are coupled rather than independent, is more complex to implement and requires the inter-node timing accuracy discussed in our earlier piece on GPS-denied timing. But it eliminates a whole class of false positives that single-node classification cannot resolve. The operational value of that false positive reduction is significant for tunnel threat detection: a system that raises an alert on every ambiguous seismic event in a high-noise-floor tunnel will be ignored after the first week of deployment.

Practical Deployment Parameters

Based on what the literature reports and what we've characterized in internal testing (with the caveats noted above), some practical starting-point parameters for tunnel footstep detection systems:

The detection problem is tractable in many operational tunnel environments. The classification problem — distinguishing human footsteps from other tunnel events with operational reliability — requires careful noise characterization specific to the deployment environment before committing to a sensor spacing and algorithm design. No footstep classifier designed for one tunnel type should be assumed to transfer without re-validation to a different tunnel, even one that appears physically similar.