Audio
Auto-generated documentation for musicalgestures._audio module.
MgAudio
Class container for audio analysis processes.
MgAudio().descriptors
def descriptors(
    n_mels=128,
    fmin=0.0,
    fmax=None,
    power=2,
    dpi=300,
    autoshow=True,
    original_time=False,
    title=None,
    target_name=None,
    overwrite=False,
):
Renders a figure of plots showing spectral/loudness descriptors of the video/audio file, including RMS energy, spectral flatness, centroid, bandwidth, and rolloff.
Arguments
n_mels (int, optional): The number of mel filters to use for filtering the frequency domain. Affects the vertical resolution (sharpness) of the spectrogram. NB: Values that are too high relative to the window size can result in artifacts (typically black lines) in the resulting image. Defaults to 128.
fmin (float, optional): Lowest frequency (in Hz). Defaults to 0.0.
fmax (float, optional): Highest frequency (in Hz). Defaults to None, in which case fmax = sr / 2.0 (half the sampling rate) is used.
power (float, optional): The steepness of the curve for the color mapping. Defaults to 2.
dpi (int, optional): Image quality of the rendered figure in DPI. Defaults to 300.
autoshow (bool, optional): Whether to show the resulting figure automatically. Defaults to True.
original_time (bool, optional): Whether to plot original time or not. This parameter can be useful if the file has been shortened beforehand (e.g. skip). Defaults to False.
title (str, optional): Title to add to the figure. Pass the string 'filename' to use the filename as the title. Defaults to None.
target_name (str, optional): The name of the output image. Defaults to None (which assumes that the input filename with the suffix "_descriptors.png" should be used).
overwrite (bool, optional): Whether to allow overwriting existing files or to automatically increment target filenames to avoid overwriting. Defaults to False.
Returns
MgFigure: An MgFigure object referring to the internal figure and its data.
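To make two of the descriptors named above more concrete, the following is a minimal sketch of RMS energy and spectral centroid for a single frame of samples. It is not the library's implementation: the helper names `rms_energy` and `spectral_centroid`, the naive DFT, and the synthetic test tone are all assumptions made for illustration.

```python
import cmath
import math


def rms_energy(frame):
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def spectral_centroid(frame, sr):
    """Magnitude-weighted mean frequency (Hz) of one frame, via a naive DFT."""
    n = len(frame)
    half = n // 2 + 1  # bins from 0 Hz up to the Nyquist frequency sr / 2
    mags = []
    for k in range(half):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    freqs = [k * sr / n for k in range(half)]
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    return sum(f * m for f, m in zip(freqs, mags)) / total


# Hypothetical test signal: a 1000 Hz sine that falls exactly on a DFT bin.
sr = 8000
n = 64
tone_hz = 1000.0
frame = [math.sin(2 * math.pi * tone_hz * t / sr) for t in range(n)]

rms = rms_energy(frame)
centroid = spectral_centroid(frame, sr)
```

For a pure tone the centroid sits at the tone's frequency and the RMS energy is the amplitude divided by the square root of two; richer signals spread the centroid toward wherever most of the spectral magnitude lies.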
MgAudio().format_time
Formats the time axis for audio plots of a video file. This is useful for plotting against the original time of the video when frames have been skipped beforehand.
Arguments
ax (str, optional): Axis of the figure.
original_time (bool, optional): Whether to get the original time for audio plotting or not. Defaults to True.
original_duration (bool, optional): Whether to add the original duration of the file to be formatted manually. Defaults to None.
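The idea of labelling a shortened plot with original time can be sketched as a simple rescaling of tick positions. This is only an illustration under stated assumptions: `original_time_label` is a hypothetical helper, not part of musicalgestures, and it assumes the shortened file covers the original duration uniformly (as it would when frames are skipped at a fixed rate).

```python
def original_time_label(t_plot, plotted_duration, original_duration):
    """Map a time on the plotted (shortened) axis to an m:ss label in original time."""
    if plotted_duration <= 0:
        raise ValueError("plotted_duration must be positive")
    # Linear rescaling: assumes frames were skipped at a constant rate.
    t_orig = t_plot * original_duration / plotted_duration
    minutes, seconds = divmod(int(round(t_orig)), 60)
    return f"{minutes}:{seconds:02d}"


# Hypothetical case: a 60 s recording shortened to 10 s.
labels = [original_time_label(t, 10.0, 60.0) for t in (0.0, 2.5, 5.0, 10.0)]
```

With these numbers the four tick positions map to 0:00, 0:15, 0:30 and 1:00 of the original recording.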
MgAudio().hpss
def hpss(
    dim=2,
    n_mels=128,
    fmin=0.0,
    fmax=None,
    kernel_size=31,
    margin=(1.0, 5.0),
    power=2.0,
    top_db=80.0,
    mask=False,
    residual=False,
    dpi=300,
    autoshow=True,
    original_time=False,
    title=None,
    target_name=None,
    overwrite=False,
):
Renders a figure with plots of the harmonic and percussive components of the audio file.
Arguments
dim (str, optional): Whether to plot hpss in one (i.e. waveform) or two (i.e. spectrogram) dimensions. Defaults to 2.
n_mels (int, optional): Number of Mel bands to generate. Defaults to 128.
fmin (float, optional): Lowest frequency (in Hz). Defaults to 0.0.
fmax (float, optional): Highest frequency (in Hz). Defaults to None, in which case fmax = sr / 2.0 (half the sampling rate) is used.
kernel_size (int or tuple, optional): Kernel size(s) for the median filters. If tuple, the first value specifies the width of the harmonic filter, and the second value specifies the width of the percussive filter. Defaults to 31.
margin (float or tuple, optional): Margin size(s) for the masks (as described in this paper). If tuple, the first value specifies the margin of the harmonic mask, and the second value specifies the margin of the percussive mask. Defaults to (1.0, 5.0).
power (float, optional): Exponent for the Wiener filter when constructing soft mask matrices. Defaults to 2.0.
top_db (float, optional): Threshold the output at top_db below the peak: max(20 * log10(S/ref)) - top_db. Defaults to 80.0.
mask (bool, optional): Return the masking matrices instead of components. Defaults to False.
residual (bool, optional): Whether to return residual components of the audio file or not. Defaults to False.
dpi (int, optional): Image quality of the rendered figure in DPI. Defaults to 300.
autoshow (bool, optional): Whether to show the resulting figure automatically. Defaults to True.
original_time (bool, optional): Whether to plot original time or not. This parameter can be useful if the video file has been shortened beforehand (e.g. skip). Defaults to False.
title (str, optional): Title to add to the figure. Pass the string 'filename' to use the filename as the title. Defaults to None.
target_name (str, optional): The name of the output image. Defaults to None (which assumes that the input filename with the suffix "_tempogram.png" should be used).
overwrite (bool, optional): Whether to allow overwriting existing files or to automatically increment target filenames to avoid overwriting. Defaults to False.
Returns
MgFigure: An MgFigure object referring to the internal figure and its data.
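The interplay of the margin and power arguments can be illustrated with a small sketch of Wiener-style soft masking for a single time-frequency bin. It assumes that `harm` and `perc` are the magnitudes obtained from the harmonic and percussive median filters; the function names `soft_mask` and `hpss_masks` are hypothetical, and the code is a simplified illustration, not the code path MgAudio uses.

```python
def soft_mask(x, x_ref, power=2.0):
    """Soft mask for one bin: x**power / (x**power + x_ref**power)."""
    if x == 0.0 and x_ref == 0.0:
        return 0.0  # nothing to assign when both inputs are silent
    xp, rp = x ** power, x_ref ** power
    return xp / (xp + rp)


def hpss_masks(harm, perc, margin=(1.0, 5.0), power=2.0):
    """Harmonic and percussive masks for one bin, each competing against
    the other component scaled by its own margin."""
    margin_h, margin_p = margin
    mask_h = soft_mask(harm, perc * margin_h, power)
    mask_p = soft_mask(perc, harm * margin_p, power)
    return mask_h, mask_p


# Hypothetical bin where the harmonic filter responds three times as strongly.
mask_h, mask_p = hpss_masks(harm=3.0, perc=1.0)

# With a margin above 1 the two masks no longer sum to 1;
# the remaining share is what ends up in the residual component.
residual_share = 1.0 - mask_h - mask_p
```

In this sketch a larger margin makes a component harder to claim for its mask, so more energy is left unassigned, while a larger power pushes each mask closer to a hard 0/1 decision.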
MgAudio().numpy
Read the original file of the MgAudio object as a numpy array using librosa.
MgAudio().spectrogram
def spectrogram(
    fmin=0.0,
    fmax=None,
    n_mels=128,
    power=2.0,
    top_db=80.0,
    dpi=300,
    autoshow=True,
    raw=False,
    original_time=False,
    title=None,
    target_name=None,
    overwrite=False,
):
Renders a figure showing the mel-scaled spectrogram of the video/audio file.
Arguments
n_mels (int, optional): The number of filters to use for filtering the frequency domain. Affects the vertical resolution (sharpness) of the spectrogram. NB: Values that are too high relative to the window size can result in artifacts (typically black lines) in the resulting image. Defaults to 128.
fmin (float, optional): Lowest frequency (in Hz). Defaults to 0.0.
fmax (float, optional): Highest frequency (in Hz). Defaults to None, in which case fmax = sr / 2.0 (half the sampling rate) is used.
power (float, optional): The steepness of the curve for the color mapping. Defaults to 2.
top_db (float, optional): Threshold the output at top_db below the peak: max(20 * log10(S/ref)) - top_db. Defaults to 80.0.
dpi (int, optional): Image quality of the rendered figure in DPI. Defaults to 300.
autoshow (bool, optional): Whether to show the resulting figure automatically. Defaults to True.
raw (bool, optional): Whether to show labels and ticks on the plot. Defaults to False.
original_time (bool, optional): Whether to plot original time or not. This parameter can be useful if the video file has been shortened beforehand (e.g. skip). Defaults to False.
title (str, optional): Title to add to the figure. Pass the string 'filename' to use the filename as the title. Defaults to None.
target_name (str, optional): The name of the output image. Defaults to None (which assumes that the input filename with the suffix "_spectrogram.png" should be used).
overwrite (bool, optional): Whether to allow overwriting existing files or to automatically increment target filenames to avoid overwriting. Defaults to False.
Returns
MgFigure: An MgFigure object referring to the internal figure and its data.
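The top_db threshold described above can be sketched in a few lines. The example follows the formula exactly as written in the argument description (20 * log10, so `S` is treated as an amplitude); the function name `to_db_with_floor` and the `amin` guard against log(0) are assumptions added for illustration and are not taken from the library.

```python
import math


def to_db_with_floor(values, ref=1.0, amin=1e-10, top_db=80.0):
    """Convert amplitudes to dB relative to ref, then clip at top_db below the peak."""
    db = [20.0 * math.log10(max(v, amin) / ref) for v in values]
    if top_db is not None:
        floor = max(db) - top_db  # max(20 * log10(S/ref)) - top_db
        db = [max(d, floor) for d in db]
    return db


# Hypothetical amplitudes spanning 120 dB of dynamic range.
db = to_db_with_floor([1.0, 0.1, 1e-6])
```

Here the peak stays at 0 dB, the middle value lands at about -20 dB, and the quietest value is clipped from about -120 dB up to -80 dB, which keeps very quiet bins from stretching the color scale of the image.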
MgAudio().tempogram
def tempogram(
    dpi=300,
    autoshow=True,
    raw=False,
    original_time=False,
    title=None,
    target_name=None,
    overwrite=False,
):
Renders a figure with plots of the onset strength and tempogram of the video/audio file.
Arguments
dpi (int, optional): Image quality of the rendered figure in DPI. Defaults to 300.
autoshow (bool, optional): Whether to show the resulting figure automatically. Defaults to True.
raw (bool, optional): Whether to show labels and ticks on the plot. Defaults to False.
original_time (bool, optional): Whether to plot original time or not. This parameter can be useful if the video file has been shortened beforehand (e.g. skip). Defaults to False.
title (str, optional): Title to add to the figure. Pass the string 'filename' to use the filename as the title. Defaults to None.
target_name (str, optional): The name of the output image. Defaults to None (which assumes that the input filename with the suffix "_tempogram.png" should be used).
overwrite (bool, optional): Whether to allow overwriting existing files or to automatically increment target filenames to avoid overwriting. Defaults to False.
Returns
MgFigure: An MgFigure object referring to the internal figure and its data.
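As a rough illustration of the two quantities plotted here, the sketch below derives an onset strength envelope from frame energies and uses a global autocorrelation to find its dominant period. A tempogram is usually computed as a windowed, local version of this autocorrelation; the helper names, the frame rate, and the synthetic energies are assumptions for illustration only.

```python
def onset_strength(frame_energies):
    """Half-wave rectified frame-to-frame energy increase."""
    rises = [max(0.0, b - a) for a, b in zip(frame_energies, frame_energies[1:])]
    return [0.0] + rises  # pad so the envelope has one value per frame


def autocorrelation(env, max_lag):
    """Unnormalized autocorrelation of the envelope for lags 0..max_lag."""
    return [
        sum(env[i] * env[i + lag] for i in range(len(env) - lag))
        for lag in range(max_lag + 1)
    ]


# Hypothetical input: 8 frames per second, an energy burst every 4 frames.
frame_rate = 8.0
energies = [1.0 if i % 4 == 0 else 0.0 for i in range(32)]

env = onset_strength(energies)
ac = autocorrelation(env, max_lag=8)

# Skip lag 0 (always the maximum) and pick the strongest periodicity.
best_lag = max(range(1, len(ac)), key=lambda lag: ac[lag])
tempo_bpm = 60.0 * frame_rate / best_lag
```

With a burst every 4 frames at 8 frames per second, the strongest lag is 4 frames, which corresponds to 120 beats per minute.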
MgAudio().waveform
def waveform(
    dpi=300,
    autoshow=True,
    raw=False,
    colored=False,
    image_width=2500,
    image_height=500,
    fmin=500,
    fmax=None,
    cmap='freesound',
    original_time=True,
    title=None,
    target_name=None,
    overwrite=False,
):
Renders a figure showing the waveform of the video/audio file.
Arguments
dpi (int, optional): Image quality of the rendered figure in DPI. Defaults to 300.
autoshow (bool, optional): Whether to show the resulting figure automatically. Defaults to True.
raw (bool, optional): Whether to show labels and ticks on the plot. Defaults to False.
colored (bool, optional): Whether to create a colored waveform image (freesound-style) from an audio input file. Defaults to False.
image_width (int, optional): Number of pixels for the colored waveform image width. Defaults to 2500.
image_height (int, optional): Number of pixels for the colored waveform image height. Defaults to 500.
fmin (int, optional): Minimum frequency for computing spectral centroid for the colored waveform image. Defaults to 500.
fmax (int, optional): Maximum frequency for computing spectral centroid for the colored waveform image. Defaults to None (i.e. Nyquist frequency).
cmap (str, optional): Colormap used for coloring the waveform. All colormaps included with matplotlib can be used. Defaults to 'freesound'.
original_time (bool, optional): Whether to plot original time or not. This parameter can be useful if the video file has been shortened beforehand (e.g. skip). Defaults to True.
title (str, optional): Title to add to the figure. Pass the string 'filename' to use the filename as the title. Defaults to None.
target_name (str, optional): The name of the output image. Defaults to None (which assumes that the input filename with the suffix "_waveform.png" should be used).
overwrite (bool, optional): Whether to allow overwriting existing files or to automatically increment target filenames to avoid overwriting. Defaults to False.
Returns
MgFigure: An MgFigure object referring to the internal figure and its data.
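The target_name and overwrite arguments shared by the plotting methods describe a default output name plus automatic incrementing. The sketch below shows one way such behavior could work. The function `resolve_target_name` and the `_0`, `_1`, ... numbering scheme are hypothetical; the actual naming used by musicalgestures may differ.

```python
import os
import tempfile


def resolve_target_name(input_path, suffix, target_name=None, overwrite=False):
    """Pick an output path: default to <input stem><suffix>, and avoid
    clobbering an existing file unless overwrite is True."""
    if target_name is None:
        stem, _ = os.path.splitext(input_path)
        target_name = stem + suffix
    if overwrite or not os.path.exists(target_name):
        return target_name
    # Assumed scheme: append an increasing counter before the extension.
    stem, ext = os.path.splitext(target_name)
    counter = 0
    while os.path.exists(f"{stem}_{counter}{ext}"):
        counter += 1
    return f"{stem}_{counter}{ext}"


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "dance.avi")  # hypothetical input file
    first = resolve_target_name(source, "_waveform.png")
    open(first, "w").close()  # simulate a previous render
    second = resolve_target_name(source, "_waveform.png")
    replaced = resolve_target_name(source, "_waveform.png", overwrite=True)
    names = [os.path.basename(p) for p in (first, second, replaced)]
```

In this sketch the first call yields dance_waveform.png, the second call sidesteps the existing file with dance_waveform_0.png, and the call with overwrite=True reuses the original name.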