[Libav-user] Audio conversion from planar to non-planar formats

[Libav-user] Audio conversion from planar to non-planar formats

Taha Ansari
Hi!

I am working on audio conversion. Looking at different examples online, I have been able to successfully convert from FLTP to S16. To my understanding, FLTP is planar, with floating-point values in the range -1.0 to +1.0. That converts to S16 just fine. Now I want to convert audio coming from S16P to S16.

Because of my initial success with planar to S16 (FLTP to S16), I assume I am interpreting the planar data correctly, but when I convert from S16P to S16 I get lots of artifacts in the destination audio (the sound is not clear, and it plays faster than the original track).

Could anyone kindly guide me on how to do a successful conversion from S16P to S16? I am simply amazed that such an apparently simple task is taking me forever and still gets me nowhere.

Thanks in advance @all!

_______________________________________________
Libav-user mailing list
[hidden email]
http://ffmpeg.org/mailman/listinfo/libav-user

Re: Audio conversion from planar to non-planar formats

Paul B Mahol
On 6/26/13, Taha Ansari <[hidden email]> wrote:

> [...]

It should be exactly the same as FLTP to S16. I'm not willing to play guessing games. Please provide the source code you use to convert audio if you expect someone to help you.

>
> Thanks in advance @all!
>

Re: Audio conversion from planar to non-planar formats

Hendrik Leppkes
In reply to this post by Taha Ansari
On Wed, Jun 26, 2013 at 8:01 AM, Taha Ansari <[hidden email]> wrote:

> [...]

Here, have some (pseudo) code:

/* Interleave planar S16P (one plane per channel) into packed S16. */
int16_t *dst = av_malloc(frame->channels * frame->nb_samples * sizeof(int16_t));
int16_t **src = (int16_t **)frame->extended_data; /* one pointer per channel plane */
for (int i = 0; i < frame->nb_samples; i++) {
  for (int ch = 0; ch < frame->channels; ch++) {
    dst[i * frame->channels + ch] = src[ch][i]; /* L R L R ... */
  }
}

Re: Audio conversion from planar to non-planar formats

Taha Ansari
Hi everyone!

>>It should be exactly the same as FLTP to S16. I'm not willing to play
>>guess games. Please provide source code you use for convert audio if you
>>expect someone will help you.

@Paul:

My code is similar to what I have been nagging about for the past few days on the mailing list, so sorry for not providing it earlier (it is attached near the end of this email)!

While experimenting, I have made significant progress: my conversions from S16P to S16 now appear fine (I was making a grave mistake when calling the swr_convert() function). So the current status is: I am able to convert between these two formats, but only about 90% of the audio is converted, and any media player just jumps ahead after reaching this mark.

Kindly review my updated code (note: I am also writing delayed frames at the end, but that does not help):

-------------------------------------------------

#include "stdafx.h"

#include <iostream>
#include <fstream>

#include <string>
#include <vector>
#include <map>

#include <deque>
#include <queue>

#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <conio.h>

extern "C"
{
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
#include "libavdevice/avdevice.h"
#include "libswscale/swscale.h"
#include "libavutil/dict.h"
#include "libavutil/error.h"
#include "libavutil/opt.h"
#include <libavutil/fifo.h>
#include <libavutil/imgutils.h>
#include <libavutil/samplefmt.h>
#include <libswresample/swresample.h>
}

AVFormatContext*    fmt_ctx= NULL;
int                    audio_stream_index = -1;
AVCodecContext *    codec_ctx_audio = NULL;
AVCodec*            codec_audio = NULL;
AVFrame*            decoded_frame = NULL;
uint8_t**            audio_dst_data = NULL;
int                    got_frame = 0;
int                    audiobufsize = 0;
AVPacket            input_packet;
int                    audio_dst_linesize = 0;
int                    audio_dst_bufsize = 0;
SwrContext *        swr = NULL;

AVOutputFormat *    output_format = NULL ;
AVFormatContext *    output_fmt_ctx= NULL;
AVStream *            audio_st = NULL;
AVCodec *            audio_codec = NULL;
double                audio_pts = 0.0;
AVFrame *            out_frame = avcodec_alloc_frame();

int                    audio_input_frame_size = 0;

uint8_t *            audio_data_buf = NULL;
uint8_t *            audio_out = NULL;
int                    audio_bit_rate;
int                    audio_sample_rate;
int                    audio_channels;

int decode_packet();
int open_audio_input(char* src_filename);
int decode_frame();

int open_encoder(char* output_filename);
AVStream *add_audio_stream(AVFormatContext *oc, AVCodec **codec,
    enum AVCodecID codec_id);
int open_audio(AVFormatContext *oc, AVCodec *codec, AVStream *st);
void close_audio(AVFormatContext *oc, AVStream *st);
void write_audio_frame(uint8_t ** audio_src_data, int audio_src_bufsize);

int open_audio_input(char* src_filename)
{
    int i =0;
    /* open input file, and allocate format context */
    if (avformat_open_input(&fmt_ctx, src_filename, NULL, NULL) < 0)
    {
        fprintf(stderr, "Could not open source file %s\n", src_filename);
        exit(1);
    }

    // Retrieve stream information
    if(avformat_find_stream_info(fmt_ctx, NULL)<0)
        return -1; // Couldn't find stream information

    // Dump information about file onto standard error
    av_dump_format(fmt_ctx, 0, src_filename, 0);

    // Find the first video stream
    for(i=0; i<fmt_ctx->nb_streams; i++)
    {
        if(fmt_ctx->streams[i]->codec->codec_type==AVMEDIA_TYPE_AUDIO)
        {
            audio_stream_index=i;
            break;
        }
    }
    if ( audio_stream_index != -1 )
    {
        // Get a pointer to the codec context for the audio stream
        codec_ctx_audio=fmt_ctx->streams[audio_stream_index]->codec;

        // Find the decoder for the video stream
        codec_audio=avcodec_find_decoder(codec_ctx_audio->codec_id);
        if(codec_audio==NULL) {
            fprintf(stderr, "Unsupported audio codec!\n");
            return -1; // Codec not found
        }

        // Open codec
        AVDictionary *codecDictOptions = NULL;
        if(avcodec_open2(codec_ctx_audio, codec_audio, &codecDictOptions)<0)
            return -1; // Could not open codec

        // Set up SWR context once you've got codec information
        swr = swr_alloc();
        av_opt_set_int(swr, "in_channel_layout",  codec_ctx_audio->channel_layout, 0);
        av_opt_set_int(swr, "out_channel_layout", codec_ctx_audio->channel_layout,  0);
        av_opt_set_int(swr, "in_sample_rate",     codec_ctx_audio->sample_rate, 0);
        av_opt_set_int(swr, "out_sample_rate",    codec_ctx_audio->sample_rate, 0);
        av_opt_set_sample_fmt(swr, "in_sample_fmt",  codec_ctx_audio->sample_fmt, 0);
        av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_S16,  0);
        swr_init(swr);

        // Allocate audio frame
        if ( decoded_frame == NULL ) decoded_frame = avcodec_alloc_frame();
        int nb_planes = 0;
        AVStream* audio_stream = fmt_ctx->streams[audio_stream_index];
        nb_planes = av_sample_fmt_is_planar(codec_ctx_audio->sample_fmt) ? codec_ctx_audio->channels : 1;
        int tempSize =  sizeof(uint8_t *) * nb_planes;
        audio_dst_data = (uint8_t**)av_mallocz(tempSize);
        if (!audio_dst_data)
        {
            fprintf(stderr, "Could not allocate audio data buffers\n");
        }
        else
        {
            for ( int i = 0 ; i < nb_planes ; i ++ )
            {
                audio_dst_data[i] = NULL;
            }
        }
    }
    return 0;
}


int decode_frame()
{
    int rv = 0;
    got_frame = 0;
    if ( fmt_ctx == NULL  )
    {
        return rv;
    }
    int ret = 0;
    audiobufsize = 0;
    rv = av_read_frame(fmt_ctx, &input_packet);
    if ( rv < 0 )
    {
        return rv;
    }
    rv = decode_packet();
    // Free the input_packet that was allocated by av_read_frame
    //av_free_packet(&input_packet);
    return rv;
}

int decode_packet()
{
    int rv = 0;
    int ret = 0;

    //audio stream?
    if(input_packet.stream_index == audio_stream_index)
    {
        /* decode audio frame */
        rv = avcodec_decode_audio4(codec_ctx_audio, decoded_frame, &got_frame, &input_packet);
        if (rv < 0)
        {
            fprintf(stderr, "Error decoding audio frame\n");
            //return ret;
        }
        else
        {
            if (got_frame)
            {
                if ( audio_dst_data[0] == NULL )
                {
                     ret = av_samples_alloc(audio_dst_data, &audio_dst_linesize, decoded_frame->channels,
                        decoded_frame->nb_samples, (AVSampleFormat)decoded_frame->format, 1);
                    if (ret < 0)
                    {
                        fprintf(stderr, "Could not allocate audio buffer\n");
                        return AVERROR(ENOMEM);
                    }
                    /* TODO: extend return code of the av_samples_* functions so that this call is not needed */
                    audio_dst_bufsize = av_samples_get_buffer_size(NULL, audio_st->codec->channels,
                        decoded_frame->nb_samples, (AVSampleFormat)decoded_frame->format, 1);

                    //int16_t* outputBuffer = ...;
                    swr_convert( swr, audio_dst_data, out_frame->nb_samples, (const uint8_t**) decoded_frame->extended_data, decoded_frame->nb_samples );
                }
                /* copy audio data to destination buffer:
                * this is required since rawaudio expects non aligned data */
                //av_samples_copy(audio_dst_data, decoded_frame->data, 0, 0,
                //    decoded_frame->nb_samples, decoded_frame->channels, (AVSampleFormat)decoded_frame->format);
            }
        }
    }
    return rv;
}


int open_encoder(char* output_filename )
{
    int rv = 0;

    /* allocate the output media context */
    AVOutputFormat *opfmt = NULL;

    avformat_alloc_output_context2(&output_fmt_ctx, opfmt, NULL, output_filename);
    if (!output_fmt_ctx) {
        printf("Could not deduce output format from file extension: using MPEG.\n");
        avformat_alloc_output_context2(&output_fmt_ctx, NULL, "mpeg", output_filename);
    }
    if (!output_fmt_ctx) {
        rv = -1;
    }
    else
    {
        output_format = output_fmt_ctx->oformat;
    }

    /* Add the audio stream using the default format codecs
    * and initialize the codecs. */
    audio_st = NULL;

    if ( output_fmt_ctx )
    {
        if (output_format->audio_codec != AV_CODEC_ID_NONE)
        {
            audio_st = add_audio_stream(output_fmt_ctx, &audio_codec, output_format->audio_codec);
        }

        /* Now that all the parameters are set, we can open the audio and
        * video codecs and allocate the necessary encode buffers. */
        if (audio_st)
        {
            rv = open_audio(output_fmt_ctx, audio_codec, audio_st);
            if ( rv < 0 ) return rv;
        }

        av_dump_format(output_fmt_ctx, 0, output_filename, 1);
        /* open the output file, if needed */
        if (!(output_format->flags & AVFMT_NOFILE))
        {
            if (avio_open(&output_fmt_ctx->pb, output_filename, AVIO_FLAG_WRITE) < 0) {
                fprintf(stderr, "Could not open '%s'\n", output_filename);
                rv = -1;
            }
            else
            {
                /* Write the stream header, if any. */
                if (avformat_write_header(output_fmt_ctx, NULL) < 0)
                {
                    fprintf(stderr, "Error occurred when opening output file\n");
                    rv = -1;
                }
            }
        }
    }

    return rv;
}

AVStream *add_audio_stream(AVFormatContext *oc, AVCodec **codec,
    enum AVCodecID codec_id)
{
    AVCodecContext *c;
    AVStream *st;

    /* find the audio encoder */
    *codec = avcodec_find_encoder(codec_id);
    if (!(*codec)) {
        fprintf(stderr, "Could not find codec\n");
        exit(1);
    }

    st = avformat_new_stream(oc, *codec);
    if (!st) {
        fprintf(stderr, "Could not allocate stream\n");
        exit(1);
    }
    st->id = 1;

    c = st->codec;

    /* put sample parameters */
    c->sample_fmt  = AV_SAMPLE_FMT_S16;
    c->bit_rate    = audio_bit_rate;
    c->sample_rate = audio_sample_rate;
    c->channels    = audio_channels;

    // some formats want stream headers to be separate
    if (oc->oformat->flags & AVFMT_GLOBALHEADER)
        c->flags |= CODEC_FLAG_GLOBAL_HEADER;

    return st;
}

int open_audio(AVFormatContext *oc, AVCodec *codec, AVStream *st)
{
    int ret=0;
    AVCodecContext *c;

    st->duration = fmt_ctx->duration;
    c = st->codec;

    /* open it */
    ret = avcodec_open2(c, codec, NULL) ;
    if ( ret < 0)
    {
        fprintf(stderr, "could not open codec\n");
        return -1;
        //exit(1);
    }

    if (c->codec->capabilities & CODEC_CAP_VARIABLE_FRAME_SIZE)
        audio_input_frame_size = 10000;
    else
        audio_input_frame_size = c->frame_size;
    int tempSize = audio_input_frame_size *
        av_get_bytes_per_sample(c->sample_fmt) *
        c->channels;
    return ret;
}

void close_audio(AVFormatContext *oc, AVStream *st)
{
    avcodec_close(st->codec);
}

void write_audio_frame(uint8_t ** audio_src_data, int audio_src_bufsize)
{
    AVFormatContext *oc = output_fmt_ctx;
    AVStream *st = audio_st;
    if ( oc == NULL || st == NULL ) return;
    AVCodecContext *c;
    AVPacket pkt = { 0 }; // data and size must be 0;
    int got_packet;

    av_init_packet(&pkt);
    c = st->codec;

    out_frame->nb_samples = audio_input_frame_size;
    int buf_size =         audio_src_bufsize *
        av_get_bytes_per_sample(c->sample_fmt) *
        c->channels;
    avcodec_fill_audio_frame(out_frame, c->channels, c->sample_fmt,
        (uint8_t *) *audio_src_data,
        buf_size, 1);
    avcodec_encode_audio2(c, &pkt, out_frame, &got_packet);
    if (!got_packet)
    {
    }
    else
    {
        if (pkt.pts != AV_NOPTS_VALUE)
            pkt.pts =  av_rescale_q(pkt.pts, st->codec->time_base, st->time_base);
        if (pkt.dts != AV_NOPTS_VALUE)
            pkt.dts = av_rescale_q(pkt.dts, st->codec->time_base, st->time_base);
        if ( c && c->coded_frame && c->coded_frame->key_frame)
            pkt.flags |= AV_PKT_FLAG_KEY;

        pkt.stream_index = st->index;
        pkt.flags |= AV_PKT_FLAG_KEY;
        /* Write the compressed frame to the media file. */
        if (av_interleaved_write_frame(oc, &pkt) != 0)
        {
            fprintf(stderr, "Error while writing audio frame\n");
            exit(1);
        }
    }
    av_free_packet(&pkt);
}


void write_delayed_frames(AVFormatContext *oc, AVStream *st)
{
    AVCodecContext *c = st->codec;
    int got_output = 0;
    int ret = 0;
    AVPacket pkt;
    pkt.data = NULL;
    pkt.size = 0;
    av_init_packet(&pkt);
    int i = 0;
    for (got_output = 1; got_output; i++)
    {
        ret = avcodec_encode_audio2(c, &pkt, NULL, &got_output);
        if (ret < 0)
        {
            fprintf(stderr, "error encoding frame\n");
            exit(1);
        }
        static int64_t tempPts = 0;
        static int64_t tempDts = 0;
        /* If size is zero, it means the image was buffered. */
        if (got_output)
        {
            if (pkt.pts != AV_NOPTS_VALUE)
                pkt.pts =  av_rescale_q(pkt.pts, st->codec->time_base, st->time_base);
            if (pkt.dts != AV_NOPTS_VALUE)
                pkt.dts = av_rescale_q(pkt.dts, st->codec->time_base, st->time_base);
            if ( c && c->coded_frame && c->coded_frame->key_frame)
                pkt.flags |= AV_PKT_FLAG_KEY;

            pkt.stream_index = st->index;
            /* Write the compressed frame to the media file. */
            ret = av_interleaved_write_frame(oc, &pkt);
        }
        else
        {
            ret = 0;
        }
        av_free_packet(&pkt);
    }
}

int main(int argc, char **argv)
{
    /* register all formats and codecs */
    av_register_all();
    avcodec_register_all();
    avformat_network_init();
    avdevice_register_all();
    int i =0;
    char src_filename[90] = "mp3.mp3";
    char dst_filename[90] = "test.mp4";
    open_audio_input(src_filename);
    audio_bit_rate        = codec_ctx_audio->bit_rate;
    audio_sample_rate    = codec_ctx_audio->sample_rate;
    audio_channels        = codec_ctx_audio->channels;
    open_encoder( dst_filename );
    while(1)
    {
        int rv = decode_frame();
        if ( rv < 0 )
        {
            break;
        }

        if (audio_st)
        {
            audio_pts = (double)audio_st->pts.val * audio_st->time_base.num /
                audio_st->time_base.den;
        }
        else
        {
            audio_pts = 0.0;
        }
        if ( codec_ctx_audio )
        {
            if ( got_frame)
            {
                write_audio_frame( audio_dst_data, audio_dst_bufsize );
            }
        }
        if ( audio_dst_data[0] )
        {
            av_freep(&audio_dst_data[0]);
            audio_dst_data[0] = NULL;
        }
        av_free_packet(&input_packet);
        printf("\naudio_pts: %.3f", audio_pts);
    }
    write_delayed_frames( output_fmt_ctx, audio_st );
    av_write_trailer(output_fmt_ctx);
    close_audio( output_fmt_ctx, audio_st);
    swr_free(&swr);
    avcodec_free_frame(&out_frame);
    return 0;
}

-----------------------------------------


On Wed, Jun 26, 2013 at 12:02 PM, Hendrik Leppkes <[hidden email]> wrote:
> [...]



Re: Audio conversion from planar to non-planar formats

Taha Ansari
I tried an experiment: mp3.mp3 as the input file and test.mp3 as the output (so the output conversion is S16 to S16P). That gets converted (transcoded) just fine, but I can't understand what exactly is happening with the conversion from mp3 to mp4.

Only one thing I think might be correlated: MP3 at 128 kbps gives me nb_samples equal to 1152, while MP4 (AAC destination) has nb_samples equal to 1024. So is it possible some frames from mp3 to mp4 are being buffered inappropriately and never get written to the destination mp4 file? (1 - 1024/1152) * 100 is about 11.11%, roughly the same amount 'not' written to the mp4 near the end.

Please can someone help me fix this?


Re: Audio conversion from planar to non-planar formats

Stefano Sabatini-2
In reply to this post by Taha Ansari
In data Wednesday 2013-06-26 13:07:05 +0500, Taha Ansari ha scritto:

> [...]

> int decode_packet()
> {
>     int rv = 0;
>     int ret = 0;
>
>     //audio stream?
>     if(input_packet.stream_index == audio_stream_index)
>     {
>         /* decode audio frame */
>         rv = avcodec_decode_audio4(codec_ctx_audio, decoded_frame, &got_frame, &input_packet);
>         if (rv < 0)
>         {
>             fprintf(stderr, "Error decoding audio frame\n");
>             //return ret;
>         }
>         else
>         {
>             if (got_frame)
>             {
>                 if ( audio_dst_data[0] == NULL )
>                 {
>                     ret = av_samples_alloc(audio_dst_data, &audio_dst_linesize, decoded_frame->channels,
>                         decoded_frame->nb_samples, (AVSampleFormat)decoded_frame->format, 1);
>                     if (ret < 0)
>                     {
>                         fprintf(stderr, "Could not allocate audio buffer\n");
>                         return AVERROR(ENOMEM);
>                     }
>                     /* TODO: extend return code of the av_samples_* functions so that this call is not needed */
>                     audio_dst_bufsize = av_samples_get_buffer_size(NULL, audio_st->codec->channels,
>                         decoded_frame->nb_samples, (AVSampleFormat)decoded_frame->format, 1);
>
>                     //int16_t* outputBuffer = ...;
>                     swr_convert( swr, audio_dst_data, out_frame->nb_samples,
>                         (const uint8_t**) decoded_frame->extended_data, decoded_frame->nb_samples );
>                 }

You may need to compute the output number of samples; indeed, the resampler cannot guarantee that the number of samples requested will be returned, due to caching or missing input data (e.g. when downsampling). Check the logic in resampling_audio.c, and check the swr_convert() return value in order to understand how many samples have been converted into your buffer.

A more high-level conversion function may help to simplify the code.

[...]

Re: Audio conversion from planar to non-planar formats

Taha Ansari
> You may need to compute the output number of samples, indeed the
> resampler can not guarantee that the number of samples requested will
> be returned due to caching or missing input data (e.g. when
> downsampling). Check the logic in resampling_audio.c, and check the
> swr_convert() return value in order to understand how many samples
> have been converted in your buffer.
>
> A more high-level conversion function may help to simplify the code.

Hi Stefano,

You are right: because I was actually downsampling (1152 samples per frame down to 1024), swr was somehow holding back internal buffers. In the resampling_audio.c example I can see repeated calls to the swr_convert function... I tried something different: flushing all the swr buffers at the end of the conversion process, and it worked just fine.
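For reference, libswresample is drained at end of stream by passing a NULL input to swr_convert() until no samples remain — a sketch against the code above (not compiled here; max_dst_nb_samples is a hypothetical capacity of audio_dst_data, and error handling is omitted):

```c
/* Drain whatever swr has buffered: NULL input, zero input samples. */
int out_samples;
do {
    out_samples = swr_convert(swr, audio_dst_data, max_dst_nb_samples, NULL, 0);
    if (out_samples > 0)
        write_audio_frame(audio_dst_data, out_samples);
} while (out_samples > 0);
```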

The only downside is that it adds up in RAM. For smaller files this is no problem, but for large files (say a 5-hour recording) it will be something like:

1152 - 1024 = 128 bytes per frame
128 * 60 = 7680 bytes per second (approx.)
7680 * 60 = 460800 bytes per minute
460800 * 60 = 27648000 bytes per hour
27648000 * 5 = 138240000 bytes for a five-hour recording
= roughly 131.8 MB accumulated in RAM

And that assumes 60 audio frames per second (I'm sure this figure is not correct, because I still need to evaluate how the sampling rate affects the formula above as well).

So although I got it to work somehow, I am not really sure this is the most elegant way of doing it. What do you advise?

p.s. During testing I tried using

av_rescale_rnd(swr_get_delay(swr_ctx, src_rate) +
    src_nb_samples, dst_rate, src_rate, AV_ROUND_UP);

but noticed artifacts in the destination audio file: it plays faster, and the data seems to be 'packed' inside each audio frame, giving me audible artifacts in the destination mp4.
