Audio¶
Auto-generated documentation for musicalgestures._audio module.
MgAudio¶
Class container for audio analysis processes.
MgAudio().descriptors¶
def descriptors(
n_mels=128,
fmin=0.0,
fmax=None,
power=2,
dpi=300,
autoshow=True,
original_time=False,
title=None,
target_name=None,
overwrite=False,
):
Renders a figure of plots showing spectral and loudness descriptors of the video/audio file: RMS energy, spectral flatness, centroid, bandwidth, and rolloff.
Arguments¶
n_mels
int, optional - The number of mel filters to use for filtering the frequency domain. Affects the vertical resolution (sharpness) of the spectrogram. NB: Values that are too high relative to the window size can produce artifacts (typically black lines) in the resulting image. Defaults to 128.
fmin
float, optional - Lowest frequency (in Hz). Defaults to 0.0.
fmax
float, optional - Highest frequency (in Hz). Defaults to None, in which case fmax = sr / 2.0 is used.
power
float, optional - The steepness of the curve for the color mapping. Defaults to 2.
dpi
int, optional - Image quality of the rendered figure in DPI. Defaults to 300.
autoshow
bool, optional - Whether to show the resulting figure automatically. Defaults to True.
original_time
bool, optional - Whether to plot original time or not. This parameter can be useful if the file has been shortened beforehand (e.g. skip). Defaults to False.
title
str, optional - Optionally add a title to the figure. The filename can be used as the title by passing the string 'filename'. Defaults to None.
target_name
str, optional - The name of the output image. Defaults to None (in which case the input filename with the suffix "_descriptors.png" is used).
overwrite
bool, optional - Whether to allow overwriting existing files or to automatically increment target filenames to avoid overwriting. Defaults to False.
Returns¶
MgFigure
- An MgFigure object referring to the internal figure and its data.
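To make two of the descriptors named above concrete, here is a minimal, dependency-free sketch of RMS energy and the spectral centroid for a single frame. It is an illustration only, not the module's implementation: the helper names, the naive DFT, and the test signal are all assumptions made for this example.

```python
import cmath
import math


def rms_energy(frame):
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def magnitude_spectrum(frame):
    """Magnitudes of a naive DFT for bins 0..N/2 (slow, but dependency-free)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sr):
    """Magnitude-weighted mean frequency (in Hz) of one frame."""
    mags = magnitude_spectrum(frame)
    freqs = [k * sr / len(frame) for k in range(len(mags))]
    total = sum(mags)
    return sum(f * m for f, m in zip(freqs, mags)) / total if total > 0 else 0.0


# Hypothetical test signal: a 1000 Hz sine sampled at 8000 Hz, 256 samples.
sr = 8000
frame = [math.sin(2 * math.pi * 1000 * i / sr) for i in range(256)]
rms = rms_energy(frame)                # close to 1 / sqrt(2) for a unit sine
centroid = spectral_centroid(frame, sr)  # close to the sine's frequency
```

For a pure tone the centroid sits at the tone's frequency; broadband or noisy frames pull it upward.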
MgAudio().format_time¶
Formats the time axis for audio plots of a video file. This is useful for plotting the original time of the video when frames have been skipped beforehand.
Arguments¶
ax
str, optional - Axis of the figure.
original_time
bool, optional - Whether to get the original time for audio plotting or not. Defaults to True.
original_duration
bool, optional - Whether to add the original duration of the file to be formatted manually. Defaults to None.
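The idea of plotting original time can be sketched as a relabelling of the time axis. The helper below is hypothetical and assumes the file was shortened uniformly, so that plotted time maps linearly onto the original timeline; the method itself may work differently.

```python
def original_time_label(t_plot, plotted_duration, original_duration):
    """Map a time on the shortened axis to an mm:ss label on the original timeline.

    Assumption: the shortening was uniform, so the mapping is a linear scale.
    """
    scale = original_duration / plotted_duration
    seconds = int(round(t_plot * scale))
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


# Hypothetical case: a 90 s recording shortened to 30 s.
labels = [original_time_label(t, 30.0, 90.0) for t in (0.0, 10.0, 30.0)]
```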
MgAudio().hpss¶
def hpss(
dim=2,
n_mels=128,
fmin=0.0,
fmax=None,
kernel_size=31,
margin=(1.0, 5.0),
power=2.0,
top_db=80.0,
mask=False,
residual=False,
dpi=300,
autoshow=True,
original_time=False,
title=None,
target_name=None,
overwrite=False,
):
Renders a figure with plots of the harmonic and percussive components of the audio file.
Arguments¶
dim
str, optional - Whether to plot hpss in one (i.e. waveform) or two (i.e. spectrogram) dimensions. Defaults to 2.
n_mels
int, optional - Number of Mel bands to generate. Defaults to 128.
fmin
float, optional - Lowest frequency (in Hz). Defaults to 0.0.
fmax
float, optional - Highest frequency (in Hz). Defaults to None, in which case fmax = sr / 2.0 is used.
kernel_size
int or tuple, optional - Kernel size(s) for the median filters. If tuple, the first value specifies the width of the harmonic filter, and the second value specifies the width of the percussive filter. Defaults to 31.
margin
float or tuple, optional - Margin size(s) for the masks (as described in this paper). If tuple, the first value specifies the margin of the harmonic mask, and the second value specifies the margin of the percussive mask. Defaults to (1.0, 5.0).
power
float, optional - Exponent for the Wiener filter when constructing soft mask matrices. Defaults to 2.0.
top_db
float, optional - Threshold the output at top_db below the peak: max(20 * log10(S/ref)) - top_db. Defaults to 80.0.
mask
bool, optional - Return the masking matrices instead of components. Defaults to False.
residual
bool, optional - Whether to return residual components of the audio file or not. Defaults to False.
dpi
int, optional - Image quality of the rendered figure in DPI. Defaults to 300.
autoshow
bool, optional - Whether to show the resulting figure automatically. Defaults to True.
original_time
bool, optional - Whether to plot original time or not. This parameter can be useful if the video file has been shortened beforehand (e.g. skip). Defaults to False.
title
str, optional - Optionally add a title to the figure. The filename can be used as the title by passing the string 'filename'. Defaults to None.
target_name
str, optional - The name of the output image. Defaults to None (which assumes that the input filename with the suffix "_tempogram.png" should be used).
overwrite
bool, optional - Whether to allow overwriting existing files or to automatically increment target filenames to avoid overwriting. Defaults to False.
Returns¶
MgFigure
- An MgFigure object referring to the internal figure and its data.
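The roles of kernel_size and power are easier to see in a toy version of median-filter separation. The sketch below is a simplified illustration in pure Python, not the code this method runs: the function names are made up, the window is given as a half-width `k` (so the full width is 2*k+1), edges are handled by truncating the window, and the margin parameter is left out.

```python
from statistics import median


def median_filter_rows(S, k):
    """Median along time within each frequency bin (keeps horizontal, harmonic ridges)."""
    n_t = len(S[0])
    return [[median(row[max(0, t - k): t + k + 1]) for t in range(n_t)] for row in S]


def median_filter_cols(S, k):
    """Median along frequency within each frame (keeps vertical, percussive ridges)."""
    n_f, n_t = len(S), len(S[0])
    return [
        [median([S[g][t] for g in range(max(0, f - k), min(n_f, f + k + 1))])
         for t in range(n_t)]
        for f in range(n_f)
    ]


def soft_masks(H, P, power=2.0, eps=1e-10):
    """Wiener-style soft masks; a larger power makes the masks closer to binary."""
    mask_h, mask_p = [], []
    for h_row, p_row in zip(H, P):
        denom = [h ** power + p ** power + eps for h, p in zip(h_row, p_row)]
        mask_h.append([h ** power / d for h, d in zip(h_row, denom)])
        mask_p.append([p ** power / d for p, d in zip(p_row, denom)])
    return mask_h, mask_p


# Toy 5x5 spectrogram (rows = frequency bins, columns = frames):
# a steady tone in bin 2 and a click in frame 2.
S = [[0.0] * 5 for _ in range(5)]
for t in range(5):
    S[2][t] = 1.0
for f in range(5):
    S[f][2] = 1.0

H = median_filter_rows(S, k=1)
P = median_filter_cols(S, k=1)
mask_h, mask_p = soft_masks(H, P, power=2.0)
```

In this toy case the steady tone ends up in the harmonic mask, the click in the percussive mask, and the single cell where they cross is shared equally between the two.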
MgAudio().numpy¶
Read the original file of the MgAudio object as a numpy array using librosa.
MgAudio().spectrogram¶
def spectrogram(
fmin=0.0,
fmax=None,
n_mels=128,
power=2.0,
top_db=80.0,
dpi=300,
autoshow=True,
raw=False,
original_time=False,
title=None,
target_name=None,
overwrite=False,
):
Renders a figure showing the mel-scaled spectrogram of the video/audio file.
Arguments¶
n_mels
int, optional - The number of filters to use for filtering the frequency domain. Affects the vertical resolution (sharpness) of the spectrogram. NB: Values that are too high relative to the window size can produce artifacts (typically black lines) in the resulting image. Defaults to 128.
fmin
float, optional - Lowest frequency (in Hz). Defaults to 0.0.
fmax
float, optional - Highest frequency (in Hz). Defaults to None, in which case fmax = sr / 2.0 is used.
power
float, optional - The steepness of the curve for the color mapping. Defaults to 2.
top_db
float, optional - Threshold the output at top_db below the peak: max(20 * log10(S/ref)) - top_db. Defaults to 80.0.
dpi
int, optional - Image quality of the rendered figure in DPI. Defaults to 300.
autoshow
bool, optional - Whether to show the resulting figure automatically. Defaults to True.
raw
bool, optional - Whether to show labels and ticks on the plot. Defaults to False.
original_time
bool, optional - Whether to plot original time or not. This parameter can be useful if the video file has been shortened beforehand (e.g. skip). Defaults to False.
title
str, optional - Optionally add a title to the figure. The filename can be used as the title by passing the string 'filename'. Defaults to None.
target_name
str, optional - The name of the output image. Defaults to None (in which case the input filename with the suffix "_spectrogram.png" is used).
overwrite
bool, optional - Whether to allow overwriting existing files or to automatically increment target filenames to avoid overwriting. Defaults to False.
Returns¶
MgFigure
- An MgFigure object referring to the internal figure and its data.
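The fmax default and the top_db threshold can be illustrated with a short sketch. It follows the formula quoted in the argument description; the helper names, the `ref` default, and the `amin` guard against log(0) are assumptions for this example and may not match what the method does internally.

```python
import math


def resolve_fmax(fmax, sr):
    """With fmax=None, fall back to the Nyquist frequency sr / 2.0."""
    return sr / 2.0 if fmax is None else fmax


def to_db_with_floor(values, ref=1.0, top_db=80.0, amin=1e-10):
    """Convert to dB, then clip everything more than top_db below the peak.

    amin is an assumed guard that keeps log10 away from zero.
    """
    db = [20.0 * math.log10(max(v, amin) / ref) for v in values]
    floor = max(db) - top_db
    return [max(d, floor) for d in db]


fmax = resolve_fmax(None, sr=22050)
# The third value lies 120 dB below the peak, so it is raised to the -80 dB floor.
db = to_db_with_floor([1.0, 0.1, 1e-6])
```

A smaller top_db compresses the dynamic range of the image, which hides quiet detail but increases contrast among the loud parts.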
MgAudio().tempogram¶
def tempogram(
dpi=300,
autoshow=True,
raw=False,
original_time=False,
title=None,
target_name=None,
overwrite=False,
):
Renders a figure with plots of the onset strength and tempogram of the video/audio file.
Arguments¶
dpi
int, optional - Image quality of the rendered figure in DPI. Defaults to 300.
autoshow
bool, optional - Whether to show the resulting figure automatically. Defaults to True.
raw
bool, optional - Whether to show labels and ticks on the plot. Defaults to False.
original_time
bool, optional - Whether to plot original time or not. This parameter can be useful if the video file has been shortened beforehand (e.g. skip). Defaults to False.
title
str, optional - Optionally add title to the figure. Possible to set the filename as the title using the string 'filename'. Defaults to None.
target_name
str, optional - The name of the output image. Defaults to None (which assumes that the input filename with the suffix "_tempogram.png" should be used).
overwrite
bool, optional - Whether to allow overwriting existing files or to automatically increment target filenames to avoid overwriting. Defaults to False.
Returns¶
MgFigure
- An MgFigure object referring to the internal figure and its data.
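As a rough picture of what the two plots represent, the sketch below derives an onset-strength envelope from per-frame energies and autocorrelates it to find the dominant beat period. This is a simplification under stated assumptions: a tempogram applies such an autocorrelation locally, window by window, whereas this example uses one global autocorrelation, and the helper names, frame rate, and input data are hypothetical.

```python
def onset_strength(energies):
    """Half-wave rectified frame-to-frame increase: only rises count as onsets."""
    return [0.0] + [max(0.0, b - a) for a, b in zip(energies, energies[1:])]


def autocorrelation(env, max_lag):
    """Unnormalized autocorrelation of the envelope for lags 0..max_lag."""
    n = len(env)
    return [sum(env[i] * env[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


frame_rate = 8.0                        # hypothetical: 8 analysis frames per second
energies = [1.0, 0.0, 0.0, 0.0] * 8     # hypothetical: an energy burst every 4 frames
env = onset_strength(energies)
ac = autocorrelation(env, max_lag=16)

# Skip lag 0 (always the maximum) and pick the strongest periodicity.
best_lag = max(range(1, len(ac)), key=lambda lag: ac[lag])
bpm = 60.0 * frame_rate / best_lag
```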
MgAudio().waveform¶
def waveform(
dpi=300,
autoshow=True,
raw=False,
colored=False,
image_width=2500,
image_height=500,
fmin=500,
fmax=None,
cmap='freesound',
original_time=True,
title=None,
target_name=None,
overwrite=False,
):
Renders a figure showing the waveform of the video/audio file.
Arguments¶
dpi
int, optional - Image quality of the rendered figure in DPI. Defaults to 300.
autoshow
bool, optional - Whether to show the resulting figure automatically. Defaults to True.
raw
bool, optional - Whether to show labels and ticks on the plot. Defaults to False.
colored
bool, optional - Whether to create a colored waveform image (freesound-style) from an audio input file. Defaults to False.
image_width
int, optional - Number of pixels for the colored waveform image width. Defaults to 2500.
image_height
int, optional - Number of pixels for the colored waveform image height. Defaults to 500.
fmin
int, optional - Minimum frequency for computing spectral centroid for the colored waveform image. Defaults to 500.
fmax
int, optional - Maximum frequency for computing spectral centroid for the colored waveform image. Defaults to None (i.e. Nyquist frequency).
cmap
str, optional - Colormap used for coloring the waveform. Any colormap included with matplotlib can be used. Defaults to 'freesound'.
original_time
bool, optional - Whether to plot original time or not. This parameter can be useful if the video file has been shortened beforehand (e.g. skip). Defaults to True.
title
str, optional - Optionally add a title to the figure. The filename can be used as the title by passing the string 'filename'. Defaults to None.
target_name
str, optional - The name of the output image. Defaults to None (in which case the input filename with the suffix "_waveform.png" is used).
overwrite
bool, optional - Whether to allow overwriting existing files or to automatically increment target filenames to avoid overwriting. Defaults to False.
Returns¶
MgFigure
- An MgFigure object referring to the internal figure and its data.
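The target_name and overwrite arguments follow the same pattern in each plotting method on this page. The sketch below shows one way such a rule could work. The helper `resolve_target_name` is hypothetical, and the `_0`, `_1`, ... numbering is an assumption made for illustration; the library's actual increment scheme may differ.

```python
import os


def resolve_target_name(input_path, suffix, target_name=None, overwrite=False,
                        exists=os.path.exists):
    """Pick an output filename: default to input stem + suffix, avoid clobbering.

    The "_<counter>" increment scheme is an assumption, not the library's rule.
    """
    if target_name is None:
        stem, _ = os.path.splitext(input_path)
        target_name = stem + suffix
    if overwrite or not exists(target_name):
        return target_name
    stem, ext = os.path.splitext(target_name)
    counter = 0
    while exists(f"{stem}_{counter}{ext}"):
        counter += 1
    return f"{stem}_{counter}{ext}"


# Hypothetical directory contents, passed in place of a real filesystem check.
taken = {"dance_waveform.png", "dance_waveform_0.png"}
name = resolve_target_name("dance.avi", "_waveform.png", exists=taken.__contains__)
forced = resolve_target_name("dance.avi", "_waveform.png", overwrite=True,
                             exists=taken.__contains__)
```

Injecting the `exists` check keeps the sketch testable without touching the disk.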