Spherical matrix arrays provide an advantageous tomographic detection geometry for non-invasive deep-tissue mapping of vascular networks and oxygenation with volumetric photoacoustic tomography (VPT). Hybridizing VPT with ultrasound (US) imaging remains difficult in this configuration because of the relatively large inter-element pitch of spherical arrays. We propose a new approach that combines VPT with contrast-enhanced 3D US imaging based on the injection of clinically approved microbubbles. Power Doppler (PD) and US localization imaging were enabled by a sparse US acquisition sequence and model-based inversion with infimal convolution of total variation (ICTV) regularization. In vitro experiments in tissue-mimicking phantoms and in vivo experiments in the living mouse brain demonstrate the capabilities of the new dual-mode imaging approach, which attains 80 μm spatial resolution and a signal-to-noise improvement of more than 10 dB over a classical delay-and-sum beamformer. Microbubble localization and tracking allowed flow velocities of up to 40 mm/s to be mapped.
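
For orientation, a model-based inversion with ICTV regularization is commonly written in the generic form below. This is a sketch in assumed notation, not necessarily the exact formulation used in this work: $u$ denotes the spatio-temporal image sequence, $A$ a linear forward model mapping the image to the recorded US signals, $p$ the measured signals, and $\lambda$, $\gamma$, $\beta_1$, $\beta_2$ are hypothetical weighting parameters.

```latex
\hat{u} = \arg\min_{u} \; \tfrac{1}{2}\,\lVert A u - p \rVert_2^2 + \lambda\,\mathrm{ICTV}(u),
\qquad
\mathrm{ICTV}(u) = \min_{v} \; \lVert \nabla_{\beta_1}(u - v) \rVert_1 + \gamma\,\lVert \nabla_{\beta_2} v \rVert_1,
\qquad
\nabla_{\beta} = \left( \partial_x,\; \partial_y,\; \partial_z,\; \beta\,\partial_t \right)
```

In this form the image is split into two components, $u - v$ and $v$, each penalized by a total variation term with a different balance between spatial and temporal smoothness. With $\beta_1 \neq \beta_2$, one component can absorb slowly varying structure while the other captures rapidly changing signal, such as that from flowing microbubbles.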