Intelligent systems in interventional healthcare depend on the reliable perception of the environment. In this context, photoacoustic tomography (PAT) has emerged as a non-invasive, functional imaging modality with great clinical potential. Current research focuses on converting the high-dimensional, not human-interpretable spectral data into the underlying functional information, specifically the blood oxygenation. One of the largely unexplored issues stalling clinical advances is the fact that the quantification problem is ambiguous, i.e. that radically different tissue parameter configurations could lead to almost identical photoacoustic spectra. In the present work, we tackle this problem with conditional Invertible Neural Networks (cINNs). Going beyond traditional point estimates, our network is used to compute an approximation of the conditional posterior density of tissue parameters given the measurement. To this end, an automatic mode detection algorithm extracts the plausible solution from the sample-based posterior. According to a comprehensive validation study based on both synthetic and real images, our approach is well-suited for exploring ambiguity in quantitative PAT.