[sip-comm-dev] GSoc 09 : Dtmf without Transformer but with BufferTransferHandler

Emil Ivov emcho at sip-communicator.org
Fri Jul 24 19:56:03 CEST 2009

Hey all,

We looked into this some more today and decided to change the design once
more (hopefully for the last time).

First of all, it appears that it's rather tricky to control the RTP
timestamp for event/DTMF packets when they are generated by a
PushBufferStream that also handles audio. JMF seems to calculate the
timestamp by itself, based on format properties such as the sample
rate (thanks to Lubomir for clarifying this). This basically
means that obtaining correct timestamps would take a considerable amount
of tweaking (unless, of course, we are missing something).
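
To make the timestamp issue concrete, here is a minimal sketch of the
sample-count accumulation that JMF appears to apply to non-MPEG audio
buffers (see the getTimeStamp() code Romain quotes further down). The
class and method names are hypothetical, and 8000 Hz, 16-bit mono with
20 ms buffers is only an assumed example format; this is a model of the
behaviour, not JMF code.

```java
/** Hypothetical model of sample-count based RTP timestamps; not JMF code. */
public class SampleCountTimestamps {
    private final double sampleRate;
    private final int sampleSizeBits;
    private final int channels;
    private long totalSamples = 0;

    public SampleCountTimestamps(double sampleRate, int sampleSizeBits, int channels) {
        this.sampleRate = sampleRate;
        this.sampleSizeBits = sampleSizeBits;
        this.channels = channels;
    }

    /** Duration in nanoseconds of a linear audio buffer of the given byte length. */
    public long durationNanos(int lengthBytes) {
        return (long) (lengthBytes * 8L * 1_000_000_000L
                / (sampleRate * sampleSizeBits * channels));
    }

    /** Same arithmetic as the quoted calculateSampleCount(): duration times rate. */
    public int sampleCount(int lengthBytes) {
        long t = durationNanos(lengthBytes);
        return (int) (((double) t * sampleRate) / 1000000000D);
    }

    /** Mirrors "totalSamples += calculateSampleCount(b); return totalSamples;". */
    public long nextTimestamp(int lengthBytes) {
        totalSamples += sampleCount(lengthBytes);
        return totalSamples;
    }

    public static void main(String[] args) {
        SampleCountTimestamps ts = new SampleCountTimestamps(8000, 16, 1);
        // 20 ms of 8000 Hz 16-bit mono audio is 320 bytes, i.e. 160 samples.
        for (int i = 0; i < 3; i++) {
            System.out.println(ts.nextTimestamp(320)); // prints 160, 320, 480
        }
    }
}
```

The point of the sketch is that the returned value only ever grows with
the amount of data pushed; nothing in this path lets a caller hold the
timestamp constant across several packets, which is what the packets of
a single RFC 4733 event need.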

The reason we first decided to try PushBufferStreams, shared between
audio and DTMF, was that they allow us to inject the DTMF
events in between audio packets and let JMF handle the RTP seqnum-s for
us.
The situation would be a lot simpler if, throughout the duration of a
DTMF event, we simply replaced all audio packets with DTMF ones. This
would be equivalent to replacing your voice with a tone.

This behaviour could be implemented by intervening only at the
Connector/Transformer level, and in a pretty straightforward way at that:

We are going to allow TransformOutputStream-s to use an _ordered_ list of
Transformer-s. This way we'd be able to register our DTMF/event
transformer before the ZRTP one in cases where both are enabled. As soon
as we receive an indication that the user has pressed a tone-generating
key, we are going to place the DTMF transformer in active mode and it
is going to start overwriting the timestamps and the payloads of all
outgoing audio packets (without touching the seqnums). After receiving
an indication that the DTMF-generating event has ended (i.e. the user
stopped pressing the key), the DTMF transformer is going to overwrite
its three final packets as per RFC 4733 and move out of active mode. In
other words: dead simple.
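
A rough sketch of what such a transformer could look like follows. It is
only an illustration under assumptions: the PacketTransformer interface,
the DtmfTransformer class and all method names are hypothetical and are
not SC's actual transformer API; packets are assumed to carry a plain
12-byte RTP header (no CSRC list, no extension); payload type 101,
160 samples per packet and volume 10 are example values; long-duration
events are not handled (the duration is simply capped).

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; every name here is hypothetical, not SC's API. */
public class DtmfChainSketch {
    public interface PacketTransformer {
        byte[] transform(byte[] rtpPacket);
    }

    public static class DtmfTransformer implements PacketTransformer {
        private static final int HEADER_LEN = 12; // fixed RTP header only
        private final int payloadType;            // assumed negotiated in SDP
        private final int samplesPerPacket;       // e.g. 160 for 20 ms at 8000 Hz
        private boolean active;
        private boolean first;
        private int endPacketsLeft = -1;          // -1 while the key is held
        private int event;
        private int duration;
        private long eventTimestamp;

        public DtmfTransformer(int payloadType, int samplesPerPacket) {
            this.payloadType = payloadType;
            this.samplesPerPacket = samplesPerPacket;
        }

        public void startTone(int eventCode) {
            event = eventCode;
            duration = 0;
            first = true;
            endPacketsLeft = -1;
            active = true;
        }

        public void stopTone() {
            if (active) endPacketsLeft = 3; // three end packets, per RFC 4733
        }

        public boolean isActive() {
            return active;
        }

        @Override
        public byte[] transform(byte[] pkt) {
            if (!active) return pkt; // audio passes through untouched
            if (first) { // the event timestamp is that of the first replaced packet
                eventTimestamp = ((pkt[4] & 0xFFL) << 24) | ((pkt[5] & 0xFFL) << 16)
                        | ((pkt[6] & 0xFFL) << 8) | (pkt[7] & 0xFFL);
            }
            // Duration grows while the key is held and for the first end packet;
            // the two retransmitted end packets repeat the final duration.
            if (endPacketsLeft == -1 || endPacketsLeft == 3) {
                duration = Math.min(duration + samplesPerPacket, 0xFFFF);
            }
            boolean end = endPacketsLeft > 0;
            byte[] out = new byte[HEADER_LEN + 4];
            System.arraycopy(pkt, 0, out, 0, HEADER_LEN); // keeps seqnum and SSRC
            out[0] = (byte) 0x80;                          // V=2, no padding/ext/CSRC
            out[1] = (byte) ((first ? 0x80 : 0) | (payloadType & 0x7F)); // marker on first
            for (int i = 0; i < 4; i++) {
                out[4 + i] = (byte) (eventTimestamp >>> (24 - 8 * i));
            }
            out[12] = (byte) event;
            out[13] = (byte) ((end ? 0x80 : 0) | 10);      // E bit plus assumed volume
            out[14] = (byte) (duration >>> 8);
            out[15] = (byte) duration;
            first = false;
            if (end && --endPacketsLeft == 0) active = false;
            return out;
        }
    }

    /** Builds a fake audio packet (PT 0, 160 payload bytes) for the demo. */
    public static byte[] audioPacket(int seq, long timestamp) {
        byte[] p = new byte[12 + 160];
        p[0] = (byte) 0x80;
        p[2] = (byte) (seq >>> 8);
        p[3] = (byte) seq;
        for (int i = 0; i < 4; i++) p[4 + i] = (byte) (timestamp >>> (24 - 8 * i));
        return p;
    }

    public static int durationOf(byte[] dtmfPacket) {
        return ((dtmfPacket[14] & 0xFF) << 8) | (dtmfPacket[15] & 0xFF);
    }

    /** Applies the transformers in list order, e.g. DTMF first, then ZRTP. */
    public static byte[] applyChain(List<PacketTransformer> chain, byte[] pkt) {
        for (PacketTransformer t : chain) pkt = t.transform(pkt);
        return pkt;
    }

    public static void main(String[] args) {
        DtmfTransformer dtmf = new DtmfTransformer(101, 160);
        List<PacketTransformer> chain = new ArrayList<>();
        chain.add(dtmf);
        chain.add(p -> p); // stand-in for the ZRTP transformer
        dtmf.startTone(5);
        for (int seq = 1; seq <= 5; seq++) {
            if (seq == 3) dtmf.stopTone();
            byte[] out = applyChain(chain, audioPacket(seq, 160L * seq));
            System.out.println("seq=" + seq + " duration=" + durationOf(out));
        }
    }
}
```

Keeping the DTMF transformer ahead of the ZRTP stand-in in the list is
what makes the ordering matter: the payload has to be rewritten before
anything encrypts it.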

The one clear inconvenience of this approach (as compared to the one
where we inject DTMF in between audio packets) is that we will be losing
all audio generated and sent while the DTMF key is pressed. I believe
this could be considered acceptable for a comm client like SC.

We couldn't find any text in RFC 4733 specifying a preference for one of
the approaches, nor anything that would deem either of them
inappropriate. Besides, a few quick tests with Twinkle, Ekiga, and
X-Lite showed this was also their default behaviour.

We should therefore be on safe ground and are thus switching back to the
use of the Connector.


Romain wrote:
> Hi
> We talked with Lubomir and Emil today about the DTMF design, and
> principally about seeing the user as a PushDataSource.
> This is just a refactoring that makes the code more readable.
> We will use the fact that a PushDataSource can push data using the
> function transferData(streams) from the BufferTransferHandler of the
> dataSource streams.
> When transferData(streams) is called, JMF calls the function
> read(buffer) from the Stream.
> So in order to inject a packet we just have to call
> transferData(streams), and correctly fill the buffer for the read
> function.
> Compared with the previous design, we no longer need two PushBufferStreams
> nor the InjectStreamBufferTransferHandler. Only one PushBufferStream
> is used to read the Sequence Number when the push comes from the audio
> device, and to generate DTMF packets when the push comes from our DTMF
> functions.
> We will delegate the BufferTransferHandler to the wrapped audio streams.
> Cheers
> Romain
> 2009/7/24 Romain <filirom1 at gmail.com>:
>> Hi
>> Just a few new things:
>>  - now DTMF packets are sent every 50 ms
>>  - there is a timeout if the user does not release the button
>> Still remaining:
>>  - freezing the timestamp (and I really don't know how to do this
>> with my design)
>>  - long duration events (easy)
>>  - do not send DTMF in the video stream
>>  - refactoring
>> 2009/7/23 Romain <filirom1 at gmail.com>:
>>> Hi
>>> If you don't want to read the long description, I sum it up here:
>>>  - I don't know how to freeze the timestamp for the two last packets
>>>  - The SeqNum increment works correctly now when I inject DTMF packets
>>>  - Our DTMF implementation should adapt itself depending on the
>>> remote side (no support for DTMF on RTP, support for Payload Type 100,
>>> 101,...)
>>> 1 - timestamp :
>>> In RTPTransmitter
>>> public void TransmitPacket(Buffer b, SendSSRCInfo info)
>>> {
>>>        info.rtptime = info.getTimeStamp(b);
>>>        RTPPacket p = MakeRTPPacket(b, info);
>>>        ...
>>> }
>>> protected RTPPacket MakeRTPPacket(Buffer b, SendSSRCInfo info)
>>> {
>>>        ...
>>>        RTPPacket rtp = new RTPPacket(p);
>>>        rtp.timestamp = ((SSRCInfo) (info)).rtptime;
>>>        ...
>>> }
>>> So let's see what happens in info.getTimeStamp(b), because this is
>>> the timestamp value transmitted over RTP:
>>> public long getTimeStamp(Buffer b)
>>> {
>>>        if(b.getFormat() instanceof AudioFormat)
>>>        {
>>>                Log.comment("format "+b.getFormat());
>>>                if(mpegAudio.matches(b.getFormat()))
>>>                {
>>>                        Log.comment("match");
>>>                        if(b.getTimeStamp() >= 0L)
>>>                        {
>>>                                Log.comment(">0L");
>>>                                return (b.getTimeStamp() * 90L) / 0xf4240L;
>>>                        } else
>>>                        {
>>>                                return System.currentTimeMillis() * 90L;
>>>                        }
>>>                } else
>>>                {
>>>                        Log.comment("arg");   //  We come always here
>>>                        totalSamples += calculateSampleCount(b);
>>>                        return totalSamples;
>>>                }
>>>        }
>>>        if(b.getFormat() instanceof VideoFormat)
>>>        {
>>>                if(b.getTimeStamp() >= 0L)
>>>                {
>>>                        return (b.getTimeStamp() * 90L) / 0xf4240L;
>>>                } else
>>>                {
>>>                        return System.currentTimeMillis() * 90L;
>>>                }
>>>        } else
>>>        {
>>>                return b.getTimeStamp();
>>>        }
>>> }
>>> This is what happens each time info.getTimeStamp(b) is called for a DTMF packet:
>>> totalSamples += calculateSampleCount(b);
>>> return totalSamples;
>>> calculateSampleCount(Buffer b)
>>> return -1
>>> or
>>> AudioFormat f = (AudioFormat)b.getFormat();
>>> long t = f.computeDuration(b.getLength());
>>> return (int)(((double)t * f.getSampleRate()) / 1000000000D);
>>> There is NO reference to b.timestamp. The value returned by
>>> calculateSampleCount is constant because it depends only on b.getLength()
>>> and b.getFormat().
>>> If we want to manage the timestamp value, the test
>>> if(mpegAudio.matches(b.getFormat())) has to be true.
>>>  -> so the DTMF encoding would have to be AudioFormat f = new
>>> AudioFormat("mpegaudio/rtp");
>>>    But this is impossible because the SC MediaUtil class maps
>>> encodings to RTP payload types:
>>>        public static int jmfToSdpEncoding(String jmfEncoding)
>>>    {
>>>                ...
>>>                else if (jmfEncoding.equals(DtmfConstants.DtmfEncoding))
>>> //telephone-event/8000
>>>                {
>>>                        return DtmfConstants.DtmfSDP; // 101
>>>                }
>>>                ...
>>>        }
>>>        If we set DtmfEncoding to "mpegaudio/rtp", the other AudioFormats
>>> using mpegaudio/rtp will get the DTMF payload type.
>>> Another way, also impossible, is to pass the test if(b.getFormat()
>>> instanceof VideoFormat) or to reach the last else (an unknown Format)
>>>  -> But in JMF audio and video are not processed the same way. When I
>>> tried to inject a VideoFormat into an AudioStream it broke JMF.
>>> 2 - Sequence Number :
>>> In the previous mail I explained that the sequence number jumps by 50
>>> each time it comes back to an audio packet:
>>> packet   /  SeqNum
>>> audio           100
>>> audio           101
>>> audio           102
>>> dtmf            103
>>> audio           185
>>> audio           186
>>> dtmf            187
>>> audio           256
>>> ....
>>> I traced what happens in JMF :
>>> When I create my DTMF packet, I set the sequence number of the buffer
>>> to zero: b.setSequenceNumber(0);
>>> lastBufSeq saves the sequence number of the buffer in order to test the
>>> next buffer's Sequence Number like this: (seq - lastBufSeq > 1L)
>>> That means seq > lastBufSeq + 1L, which means we can inject ONLY one
>>> packet between two audio packets if we set our DTMF Sequence Number =
>>> last audio Sequence Number.
>>>    public long getSequenceNumber(Buffer b)
>>>    {
>>>        long seq = b.getSequenceNumber(); // Here the Sequence Number
>>> of the buffer is read
>>>        if(lastSeq == -1L)
>>>        {
>>>            lastSeq = (long)((double)System.currentTimeMillis() *
>>> Math.random());
>>>            lastBufSeq = seq;
>>>                        return lastSeq;
>>>        }
>>>        if(seq - lastBufSeq > 1L) // Here we test the current buffer
>>> SeqNum and the previous buffer SeqNum. We allow a difference of 1
>>> packet.
>>>        {
>>>            lastSeq += seq - lastBufSeq;
>>>        } else
>>>        {
>>>            lastSeq++;
>>>        }
>>>        lastBufSeq = seq; // Here we save the last Sequence Number of the buffer
>>>        return lastSeq;
>>>    }
>>> In order to make it work I need the audio Sequence Number. So I
>>> created a wrapper PushBufferStream around the audio streams which
>>> delegates all its functions to the wrapped instance. But, when the read
>>> function is called, it gives us access to the Sequence Number.
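
The jump described above can be reproduced outside JMF with a small
stand-alone model of the quoted getSequenceNumber() logic. This is only
a sketch: the class name is hypothetical, the start value is passed in
instead of being random, and the buffer sequence numbers are made-up
example values.

```java
/** Hypothetical stand-alone model of the quoted getSequenceNumber() logic. */
public class SeqNumJumpSketch {
    private final long initialSeq;
    private long lastSeq = -1L;
    private long lastBufSeq;

    public SeqNumJumpSketch(long initialSeq) {
        this.initialSeq = initialSeq;
    }

    /** Same branching as the quoted method, with a fixed start value. */
    public long next(long bufSeq) {
        if (lastSeq == -1L) {
            lastSeq = initialSeq;
            lastBufSeq = bufSeq;
            return lastSeq;
        }
        if (bufSeq - lastBufSeq > 1L) {
            lastSeq += bufSeq - lastBufSeq; // gap in buffer numbering is copied to RTP
        } else {
            lastSeq++;
        }
        lastBufSeq = bufSeq;
        return lastSeq;
    }

    public static void main(String[] args) {
        // DTMF buffer numbered 0: the next audio buffer (53) looks like a gap of 53.
        SeqNumJumpSketch zero = new SeqNumJumpSketch(100);
        for (long b : new long[] {50, 51, 52, 0, 53}) {
            System.out.print(zero.next(b) + " "); // prints 100 101 102 103 156
        }
        System.out.println();
        // DTMF buffer reusing the last audio number (52): no gap is detected.
        SeqNumJumpSketch same = new SeqNumJumpSketch(100);
        for (long b : new long[] {50, 51, 52, 52, 53}) {
            System.out.print(same.next(b) + " "); // prints 100 101 102 103 104
        }
        System.out.println();
    }
}
```

In this model the size of the jump equals the buffer sequence number of
the audio buffer that follows the DTMF buffer, which is why knowing the
last audio sequence number matters.
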
>>> 3 - We will not try to send DTMF packets if the remote side does not
>>> accept them in the SDP description.
>>> One good thing to do would be to test whether the remote side accepts
>>> DTMF over RTP and, if not, to transmit it via SIP INFO.
>>> 4 - Payload Type :
>>> DTMF payload type = 101 (RFC 2833) or 100 (RFC 4733).
>>> Currently I only test one payload type (in the DtmfConstants class).
>>> We should adapt our payload type depending on the remote side's capabilities.
>>> 5 - PushBufferDataSource for the user.
>>> Lubomir, I really don't know how to create a PushBufferDataSource for
>>> the user, knowing that the DTMF streams need to be inside the audio
>>> streams (same SSRC, continuous SeqNum).
>>> Could you please give me more details about your idea? Thx.
>>> Cheers
>>> Romain
>>> 2009/7/23 Romain <filirom1 at gmail.com>:
>>>> Hi Lubomir
>>>> 2009/7/22 Lubomir Marinov <lubomir.marinov at gmail.com>:
>>>>> Hi Romain,
>>>>> Impressive skills, lack of communication.
>>>>> A great "thank you" to Emil for pointing it out to me that you deserve
>>>>> genuine congratulations for coming up with the idea on your own given
>>>>> that you're a student and it's your first project on JMF.
>>>>> Congratulations! For the guys on this development list who have
>>>>> followed our previous threads on the subject, I'd like to give the
>>>>> details that though I hinted at moving the center of the
>>>>> implementation idea in the area of codecs more than a week ago, Romain
>>>>> read my message just yesterday when he came online to submit his own
>>>>> idea and implementation.
>>>>> I fail to be amused though.
>>>>> I find it embarrassing to submit to waiting for an answer from my
>>>>> student for more than a week only to discover that my mentor message
>>>>> hasn't been read and has thus been rendered useless.
>>>> Sorry, but you sent your message on Saturday, and I saw it on Tuesday
>>>> morning. It is a short week.
>>>>> Especially when
>>>>> it comes after half a program of 15 hours per week (explicitly stated
>>>>> in the application form) just when the student states he's finally
>>>>> going to honor us with 40 hours per week.
>>>> As you said, it was explicit in my application form that during my
>>>> school term I can only give you 15 hours per week.
>>>> If it does not bother you, could we please continue this argument in
>>>> private mail?
>>>> Thx
>>>>> As to the new implementation we are being presented with, I find the
>>>>> idea correct and the design unfinished, thus the implementation is
>>>>> premature. Then, of course, the explicit stress on
>>>>> BufferTransferHandler as a packet injection means is inaccurate.
>>>> In 3 days, I presented you with another way to inject packets.
>>>> This way has more advantages than the Transformer way.
>>>> Using this way, we can implement 99% of the RFC.
>>>> The last percent is: we cannot freeze the timestamp for the 2 last packets.
>>>> In my previous mail I pointed out every problem I found in my
>>>> implementation, which means it is not finished.
>>>> But of course you have more experience with JMF, so I will follow your advice.
>>>>> The major missing piece is that there are two sources of pushes, not
>>>>> one: the very capture device for the audio AND the user for DTMF.
>>>>> Currently, the implementation only pushes through the capture device
>>>>> so it's understandable that "If the user press and releas too fast,
>>>>> some DTMF packets are missing."
>>>> I am trying to figure out your idea of two data sources: one for DTMF
>>>> and one for the user.
>>>> I think we have a problem keeping the SSRC of the DTMF stream = SSRC
>>>> of the audio stream.
>>>> This is a temporary problem that could be resolved quickly with my
>>>> implementation.
>>>>> Another weak point I find is the desire to explicitly put the
>>>>> injection in the transfer handler. For what it's worth, the
>>>>> implementation already takes over the DataSource and its streams (and,
>>>>> of course, it takes over the transfer handler which is necessary
>>>>> anyway in order to hide the wrapped stream) so I'd rather think the
>>>>> injection happens when reading from the stream, rather than when
>>>>> notifying that there's data to be read. If the very reading was taken
>>>>> over, I believe it would've been much easier to track the sequence
>>>>> number.
>>>> Now the sequence number is tracked.
>>>> But if you want to share your vision of your user DataSource, I will
>>>> have internet access tomorrow during the whole day.
>>>>>> Each time a DTMF packet is sent, the SeqNum of the next audio
>>>>>> packet increments very fast:
>>>>>> I don't think I can correct this because this behaviour is hard
>>>>>> coded in JMF:
>>>>>>    public long getSequenceNumber(Buffer b)
>>>>> I honestly don't see why this method is the problem. Please provide
>>>>> more details.
>>>> In the next mail.
>>>> I am sending it now.
>>>>> The next missing piece in the implementation is the fact that the
>>>>> injection should happen only in the stream signaled in the SDP, not
>>>>> all and certainly not the video streams.
>>>> DTMF needs to be injected into an AudioStream.
>>>>   The RTP payload format for named telephone events is designated as
>>>>   "telephone-event", the media type as "audio/telephone-event".
>>>>> And just to mention it, though I'm aware that the implementation is
>>>>> unfinished, we have to be careful in the wrapper DataSource when
>>>>> calling the wrapped DataSource's getStreams() in the constructor. Not
>>>>> only does it seem technically incorrect to do it before calling
>>>>> connect(), but there's also no technical guarantee that calling it
>>>>> later on would return the same number of streams.
>>>> Ok, I am quite new to JMF and I want to learn. My implementation is
>>>> not finished and I am aware of it.
>>>> Tomorrow I think I could give you my implementation with 99% of the
>>>> RFC implemented.
>>>> If you don't like it, and your implementation idea is better than
>>>> mine, I will do as you want.
>>>>> Regards,
>>>>> Lubomir
>>>> Romain
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe at sip-communicator.dev.java.net
>>>>> For additional commands, e-mail: dev-help at sip-communicator.dev.java.net

Emil Ivov, Ph.D.                               67000 Strasbourg,
Project Lead                                   France
SIP Communicator
emcho at sip-communicator.org                     PHONE: +
http://sip-communicator.org                    FAX:   +
