
From tterribe@xiph.org  Thu Aug  1 08:19:17 2013
Return-Path: <tterribe@xiph.org>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id B8EDF21E8153 for <codec@ietfa.amsl.com>; Thu,  1 Aug 2013 08:19:17 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.677
X-Spam-Level: 
X-Spam-Status: No, score=-2.677 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, HELO_MISMATCH_ORG=0.611, HOST_MISMATCH_COM=0.311, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id dco7pcdeW0UM for <codec@ietfa.amsl.com>; Thu,  1 Aug 2013 08:19:13 -0700 (PDT)
Received: from smtp.mozilla.org (mx1.corp.phx1.mozilla.com [63.245.216.69]) by ietfa.amsl.com (Postfix) with ESMTP id F222F21E8152 for <codec@ietf.org>; Thu,  1 Aug 2013 08:18:59 -0700 (PDT)
Received: from [130.129.33.240] (dhcp-21f0.meeting.ietf.org [130.129.33.240]) (Authenticated sender: tterriberry@mozilla.com) by mx1.mail.corp.phx1.mozilla.com (Postfix) with ESMTPSA id E397CF22DD for <codec@ietf.org>; Thu,  1 Aug 2013 08:18:58 -0700 (PDT)
Message-ID: <51FA7C61.3080704@xiph.org>
Date: Thu, 01 Aug 2013 08:18:57 -0700
From: "Timothy B. Terriberry" <tterribe@xiph.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20100101 SeaMonkey/2.16
MIME-Version: 1.0
To: "codec@ietf.org" <codec@ietf.org>
References: <20130712222833.13138.87216.idtracker@ietfa.amsl.com> <51E0BDE1.9020901@jmvalin.ca>
In-Reply-To: <51E0BDE1.9020901@jmvalin.ca>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [codec] Fwd: New Version Notification for draft-valin-codec-opus-update-00.txt
X-BeenThere: codec@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Codec WG <codec.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/codec>, <mailto:codec-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/codec>
List-Post: <mailto:codec@ietf.org>
List-Help: <mailto:codec-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/codec>, <mailto:codec-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 01 Aug 2013 15:19:17 -0000

Jean-Marc Valin wrote:
> See these proposed fixes to the Opus RFC. These are minor changes to the
> normative part of the reference implementation.

With my chair hat on:

Although JDR originally said we would handle updates to RFC 6716 through 
the normal errata process [1], after conferring with our ADs, we decided 
to do them via a new draft, so that there was no question about 
overriding working group consensus.

To that end, we'd like to run a consensus call to add a milestone for 
doing these updates. It would be helpful to have reviews of this 
document before doing that call. I propose we allow a month (until Sep. 
1) for such reviews, and then run the call.


[1] 
https://datatracker.ietf.org/documents/LIAISON/liaison-2012-09-21-codec-isoiec-jtc-1sc-29wg-11-liaison-from-ietf-codec-working-group-to-isoiec-regarding-speech-and-audio-coding-standardization-attachment-1.pdf

From markh.sj@gmail.com  Wed Aug 14 17:37:13 2013
Return-Path: <markh.sj@gmail.com>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id E481D21E80FA for <codec@ietfa.amsl.com>; Wed, 14 Aug 2013 17:37:11 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.6
X-Spam-Level: 
X-Spam-Status: No, score=-2.6 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, NO_RELAYS=-0.001]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 9O41+ToucO-b for <codec@ietfa.amsl.com>; Wed, 14 Aug 2013 17:37:11 -0700 (PDT)
Received: from mail-we0-x244.google.com (mail-we0-x244.google.com [IPv6:2a00:1450:400c:c03::244]) by ietfa.amsl.com (Postfix) with ESMTP id B29C221E80F9 for <codec@ietf.org>; Wed, 14 Aug 2013 17:37:10 -0700 (PDT)
Received: by mail-we0-f196.google.com with SMTP id p61so38821wes.3 for <codec@ietf.org>; Wed, 14 Aug 2013 17:37:09 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:message-id:subject:from:to:content-type; bh=KhIpcnM5mH1yCwEfmfakDQEepjJoxIJGNCtDBtRXm4s=; b=Cs+qkBAjioWrNcIS92tmazlU/gX0rEwKy3mfzRWp91ZcsQpea9nUUFSngTQMgkwvxz GohojYoxdBQA7BLwjvq5qAEyjkROLHTc6YkJTr/Tm+hUpgZidnjD+On2p4LBYq2T+A3X IwVweSHab00mHvs/j6v+py77Pksn1hv4glroRJmMmC+tdToLzUOgq3ck1wQSOAdlBkr2 O+FYQJ66gbWevulRCZr3vDMVbSt8FdaBQhaonWeqFLa4EyzfkgHJP3NqaSmx6W1ynFbf mXCph7onPdiKfqMi9NnsYTbrwQ37yg6mKPZpv8uWmv3JiKGE13rwzoS1jz2GF8MzGMGW 2wmA==
MIME-Version: 1.0
X-Received: by 10.180.98.3 with SMTP id ee3mr145777wib.48.1376527029870; Wed, 14 Aug 2013 17:37:09 -0700 (PDT)
Sender: markh.sj@gmail.com
Received: by 10.194.76.234 with HTTP; Wed, 14 Aug 2013 17:37:09 -0700 (PDT)
Date: Wed, 14 Aug 2013 17:37:09 -0700
X-Google-Sender-Auth: N-HS9y5-7y9ykl9ZECtumCrjovU
Message-ID: <CAMdZqKEDk4rJeEWr-0-oxHQDiy+Lk5QQei9-b+yrXLSRYs8GhQ@mail.gmail.com>
From: Mark Harris <mark.hsj@gmail.com>
To: codec@ietf.org
Content-Type: text/plain; charset=UTF-8
Subject: [codec] Ogg Opus zero-length frames
X-List-Received-Date: Thu, 15 Aug 2013 00:41:46 -0000

According to draft-ietf-codec-oggopus-01 section 4, Opus frames
with length 0 can be used to fill in gaps when writing Opus to Ogg,
since no gaps are allowed.  That makes sense, and is useful for
representing frames that were lost, corrupted, or not transmitted
due to DTX.  However, those frames need to go in packets, which
require a TOC.  That means that in order to represent packets that
are not available, the muxer must synthesize a TOC.

It would be nice to have guidelines for encoding these gaps in the
Ogg Opus draft, including:

  * Clarification that zero-length frames should be written, with
    a synthesized TOC if necessary, in the case of missing frames
    (e.g. due to packet loss or corruption).  Currently section 4
    only mentions these frames in the context of an encoder with
    DTX enabled.

  * Clarification as to whether the LP/Hybrid/MDCT mode and stereo
    bit matter for zero-length frames, and if so then a recommended
    method of choosing these when it is necessary to synthesize a
    TOC.  This may be influenced by the duration of the gap, since
    not all valid gap durations are possible to encode in LP mode.

  * Clarification as to whether the number and duration of the
    individual frames matter, as long as the total duration is
    correct, or a recommended method of choosing frame durations.
    For example, it takes fewer bytes to encode a 95 ms gap as 19 5 ms
    frames (using a single code 3 CBR packet), but if the previous
    packet was LP mode it may be better to use multiple packets
    with longer frames in order to remain in LP mode as long as
    possible.  There may also be a benefit to keeping the previous
    frame size as long as possible.

  * Clarification as to whether zero-length frames may be used in
    code 0 or code 1 packets, since the referenced RFC 6716 section
    3.2.1 only mentions their use in code 2 and code 3 packets.

  * Clarification as to whether it is acceptable to combine
    synthesized zero-length frames with other longer frames in the
    same code 2 or code 3 packet, when the TOC matches.

  * Recommended method of recording gaps that are not a multiple of
    2.5ms.
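
For reference, the TOC fields these questions refer to (mode, audio
bandwidth, frame size, stereo bit, and frame-count code) can be read
out of a single byte. The following is a minimal sketch based on the
configuration table in RFC 6716 section 3.1; the function name
parse_toc and the returned dictionary keys are illustrative only and
do not come from any existing library.

```python
def parse_toc(toc: int) -> dict:
    """Split an Opus TOC byte into its fields (RFC 6716 section 3.1)."""
    if not 0 <= toc <= 0xFF:
        raise ValueError("TOC must be a single byte")
    config = toc >> 3          # top five bits: mode, bandwidth, frame size
    stereo = bool(toc & 0x04)  # the "s" bit
    code = toc & 0x03          # the "c" field: frame-count code 0..3

    if config < 12:            # configs 0..11: SILK-only (LP) mode
        mode = "SILK"
        bandwidth = ("NB", "MB", "WB")[config // 4]
        frame_ms = (10, 20, 40, 60)[config % 4]
    elif config < 16:          # configs 12..15: Hybrid mode
        mode = "Hybrid"
        bandwidth = ("SWB", "FB")[(config - 12) // 2]
        frame_ms = (10, 20)[config % 2]
    else:                      # configs 16..31: CELT-only (MDCT) mode
        mode = "CELT"
        bandwidth = ("NB", "WB", "SWB", "FB")[(config - 16) // 4]
        frame_ms = (2.5, 5, 10, 20)[config % 4]

    return {
        "config": config,
        "mode": mode,
        "bandwidth": bandwidth,
        "frame_ms": frame_ms,
        "stereo": stereo,
        "code": code,
    }
```

A muxer that wants to reuse the previous packet's configuration would
only need the first byte of that packet, e.g. parse_toc(packet[0]).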

--

From tterribe@xiph.org  Wed Aug 21 18:00:17 2013
Return-Path: <tterribe@xiph.org>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 8C42711E8193 for <codec@ietfa.amsl.com>; Wed, 21 Aug 2013 18:00:17 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.677
X-Spam-Level: 
X-Spam-Status: No, score=-2.677 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, HELO_MISMATCH_ORG=0.611, HOST_MISMATCH_COM=0.311, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 5rvw-Io4MTQW for <codec@ietfa.amsl.com>; Wed, 21 Aug 2013 18:00:11 -0700 (PDT)
Received: from smtp.mozilla.org (mx2.corp.phx1.mozilla.com [63.245.216.70]) by ietfa.amsl.com (Postfix) with ESMTP id ECF3311E810B for <codec@ietf.org>; Wed, 21 Aug 2013 18:00:10 -0700 (PDT)
Received: from [10.250.6.54] (corp-240.mv.mozilla.com [63.245.220.240]) (Authenticated sender: tterriberry@mozilla.com) by mx2.mail.corp.phx1.mozilla.com (Postfix) with ESMTPSA id 4CD30F225D;  Wed, 21 Aug 2013 18:00:09 -0700 (PDT)
Message-ID: <52156299.6080906@xiph.org>
Date: Wed, 21 Aug 2013 18:00:09 -0700
From: "Timothy B. Terriberry" <tterribe@xiph.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20100101 SeaMonkey/2.16.2
MIME-Version: 1.0
To: Mark Harris <mark.hsj@gmail.com>, codec@ietf.org
References: <CAMdZqKEDk4rJeEWr-0-oxHQDiy+Lk5QQei9-b+yrXLSRYs8GhQ@mail.gmail.com>
In-Reply-To: <CAMdZqKEDk4rJeEWr-0-oxHQDiy+Lk5QQei9-b+yrXLSRYs8GhQ@mail.gmail.com>
Content-Type: multipart/mixed; boundary="------------030106010700090708090409"
Subject: Re: [codec] Ogg Opus zero-length frames
X-List-Received-Date: Thu, 22 Aug 2013 01:00:17 -0000

This is a multi-part message in MIME format.
--------------030106010700090708090409
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Mark Harris wrote:
> It would be nice to have guidelines for encoding these gaps in the
> Ogg Opus draft, including:

I took a stab at drafting some text that answers these questions (XML 
diff attached):

4.1.  Repairing Gaps in Real-time Streams

    In order to support capturing a real-time stream that has lost
    packets, or that uses discontinuous transmission (DTX), a muxer
    SHOULD emit packets that explicitly request the use of Packet Loss
    Concealment (PLC) in place of the packets that were not transmitted.
    Only gaps that are a multiple of 2.5 ms are repairable, as these are
    the only durations that can be created by packet loss or DTX.  Muxers
    need not handle other gap sizes.  Creating the necessary packets
    involves synthesizing a TOC byte (defined in Section 3.1
    of [RFC6716])---and whatever additional internal framing is needed---
    to indicate the packet duration for each stream.  The actual length
    of each missing Opus frame inside the packet is zero bytes, as
    defined in Section 3.2.1 of [RFC6716].

    [RFC6716] does not impose any requirements on the PLC, but this
    section outlines choices that are expected to have a positive
    influence on most PLC implementations, including the reference
    implementation.  When possible, creating the TOC byte using the same
    mode, audio bandwidth, channel count, and frame size as the previous
    packet (if any) covers all losses that do not include a configuration
    switch, as defined in Section 4.5 of [RFC6716].  This is the simplest
    and usually the most well-tested case for the PLC to handle.  If
    there is no previous packet, reasonable decoders will not emit
    anything other than silence regardless of the mode.  Using the CELT-
    only mode for this case (with any audio bandwidth) allows maximum
    flexibility, since a single packet can represent any duration up to
    120 ms that is a multiple of 2.5 ms using at most two bytes.

    When a previous packet is available, keeping the audio bandwidth and
    channel count the same allows the PLC to provide maximum continuity
    in the concealment data it generates.  However, if the size of the
    gap is not a multiple of the most recent frame size, then the frame
    size will have to change for at least some frames.  Delaying such
    changes as long as possible to simplifies things for PLC
    implementations.  A 95 ms gap could be encoded as 19 5 ms frames in
    two bytes with a single CBR code 3 packet.  If the previous frame
    size was 20 ms, using four 80 ms frames, followed by three 5 ms
    frames requires 4 bytes (plus an extra byte of Ogg lacing overhead),
    but allows the PLC to use its well-tested steady state behavior for
    as long as possible.  The total bitrate of the latter approach,
    including Ogg overhead, is about 0.4 kbps, so the impact on file size
    is minimal.

    Changing modes is discouraged, since this causes some decoder
    implementations to reset their PLC state.  However, SILK and Hybrid
    modes cannot fill gaps that are not a multiple of 10 ms.  If
    switching to CELT mode is needed to match the gap size, doing so at
    the end of the gap allows the PLC to function for as long as
    possible.  Since CELT does not support medium-band audio, using
    wideband when switching from medium-band SILK ensures that any PLC
    implementation that does try to migrate state between the modes will
    not be forced to artificially reduce the bandwidth.

    The synthetic TOC byte MAY use any of codes 0, 1, 2, or 3 to pack the
    frame(s) into a packet.  If the TOC configuration matches, the muxer
    MAY combine the empty frames with previous or subsequent non-zero-
    length frames (using code 2 or VBR code 3).

It's a little bit long, but if people think it's useful (or have 
suggestions for shortening it), we should include it.
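
As a rough illustration of the packet construction described in the
proposed text (not part of the draft itself), the sketch below builds
a packet made only of zero-length frames. It assumes a single frame is
written as a code 0 packet (TOC byte only) and several frames as a CBR
code 3 packet with no padding (TOC byte plus one frame-count byte).
The names plc_packet and frame_units are hypothetical.

```python
def frame_units(config: int) -> int:
    """Frame duration for a TOC config, in units of 2.5 ms."""
    if not 0 <= config <= 31:
        raise ValueError("config must be in 0..31")
    if config < 12:                      # SILK-only: 10, 20, 40, 60 ms
        return (4, 8, 16, 24)[config % 4]
    if config < 16:                      # Hybrid: 10, 20 ms
        return (4, 8)[config % 2]
    return (1, 2, 4, 8)[config % 4]      # CELT-only: 2.5, 5, 10, 20 ms


def plc_packet(config: int, stereo: bool, frame_count: int = 1) -> bytes:
    """Build an Opus packet whose frames are all zero bytes long."""
    if frame_count < 1:
        raise ValueError("a packet needs at least one frame")
    # A packet may not represent more than 120 ms (48 units of 2.5 ms).
    if frame_count * frame_units(config) > 48:
        raise ValueError("packet would exceed 120 ms")

    toc_base = (config << 3) | (int(stereo) << 2)
    if frame_count == 1:
        # Code 0: the TOC byte alone, followed by zero bytes of frame data.
        return bytes([toc_base])
    # Code 3, CBR (v = 0), no padding (p = 0): the low six bits of the
    # second byte carry the frame count; no frame data follows.
    return bytes([toc_base | 3, frame_count])
```

For example, plc_packet(17, False, 19) describes the 95 ms gap from the
text as 19 mono CELT narrowband 5 ms frames in two bytes.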

--------------030106010700090708090409
Content-Type: text/x-patch;
 name="zero-length-frames.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="zero-length-frames.diff"

diff --git a/doc/draft-ietf-codec-oggopus.xml b/doc/draft-ietf-codec-oggopus.xml
index 6131e69..d7cca9f 100644
--- a/doc/draft-ietf-codec-oggopus.xml
+++ b/doc/draft-ietf-codec-oggopus.xml
@@ -240,23 +240,91 @@ The one exception is the last page in the stream, as described below.
 <t>
 All other pages with completed packets after the first MUST have a granule
  position equal to the number of samples contained in packets that complete on
  that page plus the granule position of the most recent page with completed
  packets.
 This guarantees that a demuxer can assign individual packets the same granule
  position when working forwards as when working backwards.
 For this to work, there cannot be any gaps.
-In order to support capturing a stream that uses discontinuous transmission
- (DTX), an encoder SHOULD emit packets that explicitly request the use of
- Packet Loss Concealment (PLC) (i.e., with a frame length of 0, as defined in
- Section 3.2.1 of <xref target="RFC6716"/>) in place of the packets that were
- not transmitted.
 </t>
 
+<section anchor="gap-repair" title="Repairing Gaps in Real-time Streams">
+<t>
+In order to support capturing a real-time stream that has lost packets, or that
+ uses discontinuous transmission (DTX), a muxer SHOULD emit packets that
+ explicitly request the use of Packet Loss Concealment (PLC) in place of the
+ packets that were not transmitted.
+Only gaps that are a multiple of 2.5&nbsp;ms are repairable, as these are the
+ only durations that can be created by packet loss or DTX.
+Muxers need not handle other gap sizes.
+Creating the necessary packets involves synthesizing a TOC byte (defined in
+ Section&nbsp;3.1 of&nbsp;<xref target="RFC6716"/>)---and whatever additional
+ internal framing is needed---to indicate the packet duration for each stream.
+The actual length of each missing Opus frame inside the packet is zero bytes,
+ as defined in Section&nbsp;3.2.1 of&nbsp;<xref target="RFC6716"/>.
+</t>
+
+<t>
+<xref target="RFC6716"/> does not impose any requirements on the PLC, but this
+ section outlines choices that are expected to have a positive influence on
+ most PLC implementations, including the reference implementation.
+When possible, creating the TOC byte using the same mode, audio bandwidth,
+ channel count, and frame size as the previous packet (if any) covers all
+ losses that do not include a configuration switch, as defined in
+ Section&nbsp;4.5 of&nbsp;<xref target="RFC6716"/>.
+This is the simplest and usually the most well-tested case for the PLC to
+ handle.
+If there is no previous packet, reasonable decoders will not emit anything
+ other than silence regardless of the mode.
+Using the CELT-only mode for this case (with any audio bandwidth) allows
+ maximum flexibility, since a single packet can represent any duration up to
+ 120&nbsp;ms that is a multiple of 2.5&nbsp;ms using at most two bytes.
+</t>
+
+<t>
+When a previous packet is available, keeping the audio bandwidth and channel
+ count the same allows the PLC to provide maximum continuity in the concealment
+ data it generates.
+However, if the size of the gap is not a multiple of the most recent frame
+ size, then the frame size will have to change for at least some frames.
+Delaying such changes as long as possible to simplifies things for PLC
+ implementations.
+A 95&nbsp;ms gap could be encoded as 19 5&nbsp;ms frames in two bytes
+ with a single CBR code&nbsp;3 packet.
+If the previous frame size was 20&nbsp;ms, using four 80&nbsp;ms frames,
+ followed by three 5&nbsp;ms frames requires 4&nbsp;bytes (plus an extra byte
+ of Ogg lacing overhead), but allows the PLC to use its well-tested steady
+ state behavior for as long as possible.
+The total bitrate of the latter approach, including Ogg overhead, is about
+ 0.4&nbsp;kbps, so the impact on file size is minimal.
+</t>
+
+<t>
+Changing modes is discouraged, since this causes some decoder implementations
+ to reset their PLC state.
+However, SILK and Hybrid modes cannot fill gaps that are not a multiple of
+ 10&nbsp;ms.
+If switching to CELT mode is needed to match the gap size, doing so at the end
+ of the gap allows the PLC to function for as long as possible.
+Since CELT does not support medium-band audio, using wideband when switching
+ from medium-band SILK ensures that any PLC implementation that does try to
+ migrate state between the modes will not be forced to artificially reduce the
+ bandwidth.
+</t>
+
+<t>
+The synthetic TOC byte MAY use any of codes&nbsp;0, 1, 2, or&nbsp;3 to pack the
+ frame(s) into a packet.
+If the TOC configuration matches, the muxer MAY combine the empty frames with
+ previous or subsequent non-zero-length frames (using code&nbsp;2 or
+ VBR code&nbsp;3).
+</t>
+</section>
+
 <section anchor="preskip" title="Pre-skip">
 <t>
 There is some amount of latency introduced during the decoding process, to
  allow for overlap in the MDCT modes, stereo mixing in the LP modes, and
  resampling, and the encoder will introduce even more latency (though the exact
  amount is not specified).
 Therefore, the first few samples produced by the decoder do not correspond to
  real input audio, but are instead composed of padding inserted by the encoder

--------------030106010700090708090409--

From markh.sj@gmail.com  Fri Aug 23 03:56:24 2013
Return-Path: <markh.sj@gmail.com>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id E705D11E81C6 for <codec@ietfa.amsl.com>; Fri, 23 Aug 2013 03:56:24 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.6
X-Spam-Level: 
X-Spam-Status: No, score=-2.6 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, NO_RELAYS=-0.001]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id NP-NtIJiN4ki for <codec@ietfa.amsl.com>; Fri, 23 Aug 2013 03:56:24 -0700 (PDT)
Received: from mail-wi0-x231.google.com (mail-wi0-x231.google.com [IPv6:2a00:1450:400c:c05::231]) by ietfa.amsl.com (Postfix) with ESMTP id 2FDDA11E819B for <codec@ietf.org>; Fri, 23 Aug 2013 03:56:24 -0700 (PDT)
Received: by mail-wi0-f177.google.com with SMTP id hq12so405748wib.10 for <codec@ietf.org>; Fri, 23 Aug 2013 03:56:23 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=ktGJKVBQ47ofOEhBv4GfPFwvtQE4hPfoDkT84U/k+fg=; b=ULw2IT3HHextXFfzRtFJD4NHwF5VC2gqYbsedI5OUSvLavFEYODfx6tEtijvAs8Kp6 Nt4DEVSSS4xFGIOQWd+uPWl8uNv9uiQjI1XkthKl5hSADB0Tut+mdTAuqarsEY7zBG8y aHnjNuJVnfk2ISGpWmte7R3tgXs4zcql3WT1TmDZjS8LfLdaV7J46RjYEawOBHImbJSK oS4ZtXQtRaq1Uk9oA0PMdlKrY+c8Ix0/FuBLKWeTMPRDLOaMzvsfmBFXm32KkdNO3XhR imr98+qbgvo+PLkqlX72nMbP9DmRVtWMUcpRa1+0wAy4lfwpNUQ/4BKvCGq5vqLdewwy Z8Ug==
MIME-Version: 1.0
X-Received: by 10.194.19.5 with SMTP id a5mr770173wje.48.1377255383351; Fri, 23 Aug 2013 03:56:23 -0700 (PDT)
Sender: markh.sj@gmail.com
Received: by 10.194.94.9 with HTTP; Fri, 23 Aug 2013 03:56:23 -0700 (PDT)
In-Reply-To: <52156299.6080906@xiph.org>
References: <CAMdZqKEDk4rJeEWr-0-oxHQDiy+Lk5QQei9-b+yrXLSRYs8GhQ@mail.gmail.com> <52156299.6080906@xiph.org>
Date: Fri, 23 Aug 2013 03:56:23 -0700
X-Google-Sender-Auth: Z0J-qqkqZQTfjXxempyGu2rsZJM
Message-ID: <CAMdZqKHq-03JfRhtC-EOUzcmBdW4uQK5BUejjdF1=OvamoiLhQ@mail.gmail.com>
From: Mark Harris <mark.hsj@gmail.com>
To: "Timothy B. Terriberry" <tterribe@xiph.org>
Content-Type: text/plain; charset=UTF-8
Cc: codec@ietf.org
Subject: Re: [codec] Ogg Opus zero-length frames
X-List-Received-Date: Fri, 23 Aug 2013 10:56:25 -0000

Timothy B. Terriberry <tterribe@xiph.org> wrote:
> I took a stab at drafting some text that answers these questions (XML diff
> attached):

Thanks; this is very helpful.  I have just a few comments:


> In order to support capturing a real-time stream that has lost
> packets, or that uses discontinuous transmission (DTX), a muxer
> SHOULD emit packets that explicitly request the use of Packet Loss
> Concealment (PLC) in place of the packets that were not transmitted.

lost or not transmitted.


> If
> there is no previous packet, reasonable decoders will not emit
> anything other than silence regardless of the mode.  Using the CELT-
> only mode for this case (with any audio bandwidth) allows maximum
> flexibility, since a single packet can represent any duration up to
> 120 ms that is a multiple of 2.5 ms using at most two bytes.

...plus one byte of Ogg lacing.

For initial zero-length frames, might it be better to prefer the
configuration of the first non-zero-length frame to the extent
possible, when available, to help in any situation where the
configuration of the first packet might be used to report
information (such as frame size), or for an initial estimate of
bandwidth, required buffer sizes, etc.?

Or perhaps the last sentence should just be omitted, since it
already effectively says that the mode, bandwidth, and channel
count are unlikely to matter to a decoder in this case.


> Delaying such
> changes as long as possible to simplifies things for PLC
> implementations.

s/to //


> A 95 ms gap could be encoded as 19 5 ms frames in
> two bytes with a single CBR code 3 packet.  If the previous frame
> size was 20 ms, using four 80 ms frames, followed by three 5 ms

s/80/20/


> frames requires 4 bytes (plus an extra byte of Ogg lacing overhead),
> but allows the PLC to use its well-tested steady state behavior for
> as long as possible.

To clarify, if the previous frame was 20 ms SILK, is this
suggesting a 4 x 20 ms SILK packet followed by a 3 x 5 ms CELT
packet?  The next paragraph suggests keeping the mode as long as
possible, implying that it may be better to use 4 x 20 ms SILK +
10 ms SILK + 5 ms CELT.  Or is minimizing the number of frame size
changes more important than keeping the mode as long as possible?


Thanks.

From tterribe@xiph.org  Fri Aug 23 06:51:40 2013
Return-Path: <tterribe@xiph.org>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id B4CB511E817A for <codec@ietfa.amsl.com>; Fri, 23 Aug 2013 06:51:40 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.677
X-Spam-Level: 
X-Spam-Status: No, score=-2.677 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, HELO_MISMATCH_ORG=0.611, HOST_MISMATCH_COM=0.311, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 0X3Ad-Bj0U-j for <codec@ietfa.amsl.com>; Fri, 23 Aug 2013 06:51:33 -0700 (PDT)
Received: from smtp.mozilla.org (mx2.corp.phx1.mozilla.com [63.245.216.70]) by ietfa.amsl.com (Postfix) with ESMTP id 2F1DC11E80FF for <codec@ietf.org>; Fri, 23 Aug 2013 06:51:33 -0700 (PDT)
Received: from [172.17.0.5] (50-78-100-113-static.hfc.comcastbusiness.net [50.78.100.113]) (Authenticated sender: tterriberry@mozilla.com) by mx2.mail.corp.phx1.mozilla.com (Postfix) with ESMTPSA id A8F65F2181 for <codec@ietf.org>; Fri, 23 Aug 2013 06:51:31 -0700 (PDT)
Message-ID: <521768E3.5030502@xiph.org>
Date: Fri, 23 Aug 2013 06:51:31 -0700
From: "Timothy B. Terriberry" <tterribe@xiph.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20100101 SeaMonkey/2.16
MIME-Version: 1.0
To: codec@ietf.org
References: <CAMdZqKEDk4rJeEWr-0-oxHQDiy+Lk5QQei9-b+yrXLSRYs8GhQ@mail.gmail.com> <52156299.6080906@xiph.org> <CAMdZqKHq-03JfRhtC-EOUzcmBdW4uQK5BUejjdF1=OvamoiLhQ@mail.gmail.com>
In-Reply-To: <CAMdZqKHq-03JfRhtC-EOUzcmBdW4uQK5BUejjdF1=OvamoiLhQ@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [codec] Ogg Opus zero-length frames
X-List-Received-Date: Fri, 23 Aug 2013 13:51:40 -0000

Mark Harris wrote:
> lost or not transmitted.

Done.

> ...plus one byte of Ogg lacing.

See below.

> For initial zero-length frames, might it be better to prefer the
> configuration of the first non-zero-length frame to the extent
> possible, when available, to help in any situation where the
> configuration of the first packet might be used to report
> information (such as frame size), or for an initial estimate of
> bandwidth, required buffer sizes, etc.?

I thought about this a little when writing the original text, and I'm 
not sure it helps much. It's not a bad thing to do, both here and when 
you're forced to change modes to match the gap duration, but the 
benefits are small (you can't really do a good job estimating initial 
bandwidth/buffer sizes from 0-byte frames, for example), and it means 
you can't write out anything until you get the first packet after the 
gap. Most RTP stacks, on the other hand, are going to declare packets 
lost and generate PLC without necessarily waiting for that to arrive.

> Or perhaps the last sentence should just be omitted, since it
> already effectively says that the mode, bandwidth, and channel
> count are unlikely to matter to a decoder in this case.

In practice I only expect initial gaps to be useful in conjunction with 
other streams (e.g., video), since otherwise you would just take the 
arrival of the first Opus packet as the start of the stream, so perhaps 
it isn't worth spending much text on them. If you don't think this is 
necessary, let's just drop it.

> s/to //

Done.

> s/80/20/

Done.

>> frames requires 4 bytes (plus an extra byte of Ogg lacing overhead),
>> but allows the PLC to use its well-tested steady state behavior for
>> as long as possible.
>
> To clarify, if the previous frame was 20 ms SILK, is this
> suggesting a 4 x 20 ms SILK packet followed by a 3 x 5 ms CELT
> packet?  The next paragraph suggests keeping the mode as long as
> possible, implying that it may be better to use 4 x 20 ms SILK +
> 10 ms SILK + 5 ms CELT.  Or is minimizing the number of frame size
> changes more important than keeping the mode as long as possible?

No, the other way around. I changed this example a few times and didn't 
think through the last version very well, apparently. 4 x 20 ms SILK + 
10 ms SILK + 5 ms CELT would be better. I'll rework this to clarify.

Thanks for the review!
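
The ordering discussed above (keep the previous mode and frame size as
long as possible, then smaller frames in the same mode, and only then
switch to CELT) can be sketched as a greedy split. This is one possible
reading of the guidance, not text from the draft: the function name
plan_gap and the tuple layout are made up for illustration, durations
are counted in units of 2.5 ms, and bandwidth selection and the 120 ms
per-packet limit are left to the caller.

```python
# Frame sizes per mode in units of 2.5 ms, largest first.
FRAME_UNITS = {
    "SILK": (24, 16, 8, 4),    # 60, 40, 20, 10 ms
    "Hybrid": (8, 4),          # 20, 10 ms
    "CELT": (8, 4, 2, 1),      # 20, 10, 5, 2.5 ms
}


def plan_gap(gap_units: int, prev_mode: str, prev_frame_units: int) -> list:
    """Cover a gap with runs of (mode, frame_units, frame_count)."""
    if gap_units < 0:
        raise ValueError("gap must not be negative")
    if prev_frame_units not in FRAME_UNITS[prev_mode]:
        raise ValueError("frame size not valid for this mode")

    runs = []
    remaining = gap_units

    def take(mode: str, size: int) -> None:
        # Use as many frames of this size as still fit in the gap.
        nonlocal remaining
        count = remaining // size
        if count:
            runs.append((mode, size, count))
            remaining -= count * size

    # 1. Stay with the previous mode and frame size as long as possible.
    take(prev_mode, prev_frame_units)
    # 2. Stay in the previous mode, stepping down to smaller frames.
    for size in FRAME_UNITS[prev_mode]:
        if size < prev_frame_units:
            take(prev_mode, size)
    # 3. Only at the end of the gap, switch to CELT for what is left.
    for size in FRAME_UNITS["CELT"]:
        take("CELT", size)
    return runs
```

With a 95 ms gap (38 units) after 20 ms SILK frames, plan_gap(38,
"SILK", 8) yields four 20 ms SILK frames, one 10 ms SILK frame, and one
5 ms CELT frame. A run longer than 120 ms would still have to be split
across several packets.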

From jmvalin@jmvalin.ca  Fri Aug 23 21:11:35 2013
Return-Path: <jmvalin@jmvalin.ca>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 95A6721F9E3F for <codec@ietfa.amsl.com>; Fri, 23 Aug 2013 21:11:35 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.677
X-Spam-Level: 
X-Spam-Status: No, score=-2.677 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, HELO_MISMATCH_ORG=0.611, HOST_MISMATCH_COM=0.311, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 1iwJ+BsfcPrn for <codec@ietfa.amsl.com>; Fri, 23 Aug 2013 21:11:29 -0700 (PDT)
Received: from smtp.mozilla.org (mx1.corp.phx1.mozilla.com [63.245.216.69]) by ietfa.amsl.com (Postfix) with ESMTP id 52C2C21F9E45 for <codec@ietf.org>; Fri, 23 Aug 2013 21:11:25 -0700 (PDT)
Received: from [192.168.1.15] (modemcable130.97-201-24.mc.videotron.ca [24.201.97.130]) (Authenticated sender: jvalin@mozilla.com) by mx1.mail.corp.phx1.mozilla.com (Postfix) with ESMTPSA id 36341F202C;  Fri, 23 Aug 2013 21:11:24 -0700 (PDT)
Message-ID: <5218326B.4010503@jmvalin.ca>
Date: Sat, 24 Aug 2013 00:11:23 -0400
From: Jean-Marc Valin <jmvalin@jmvalin.ca>
User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:17.0) Gecko/20130801 Thunderbird/17.0.8
MIME-Version: 1.0
To: "Timothy B. Terriberry" <tterribe@xiph.org>
References: <CAMdZqKEDk4rJeEWr-0-oxHQDiy+Lk5QQei9-b+yrXLSRYs8GhQ@mail.gmail.com> <52156299.6080906@xiph.org>
In-Reply-To: <52156299.6080906@xiph.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: codec@ietf.org
Subject: Re: [codec] Ogg Opus zero-length frames
X-List-Received-Date: Sat, 24 Aug 2013 04:11:35 -0000

I'm fine with the proposed text in general. I only have two comments below:

On 08/21/2013 09:00 PM, Timothy B. Terriberry wrote:
>    When possible, creating the TOC byte using the same
>    mode, audio bandwidth, channel count, and frame size as the previous
>    packet (if any) covers all losses that do not include a configuration
>    switch, as defined in Section 4.5 of [RFC6716].

Any way you can make that sentence easier to parse?

>    If
>    there is no previous packet, reasonable decoders will not emit
>    anything other than silence regardless of the mode.  Using the CELT-
>    only mode for this case (with any audio bandwidth) allows maximum
>    flexibility, since a single packet can represent any duration up to
>    120 ms that is a multiple of 2.5 ms using at most two bytes.

I think both of these sentences should go, since they confuse more
than they help.
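
For anyone trying to follow the "at most two bytes" claim in the quoted
text, here is a rough sketch of how such a packet could be built. It is
only an illustration based on the TOC byte and code 3 frame count byte
layouts in Section 3 of RFC 6716, not text proposed for the draft. The
function name sketch_empty_packet() is made up for this example, and the
choice of narrowband mono is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libopus): build a CELT-only Opus packet
 * whose frames are all zero bytes long, covering `units` * 2.5 ms of audio.
 * Valid range is 1..48 units (2.5 ms .. 120 ms).
 * Writes at most 2 bytes to `out` and returns the number written,
 * or 0 if `units` is out of range.
 */
size_t sketch_empty_packet(unsigned units, uint8_t out[2])
{
    /* Config 16 is CELT-only, narrowband, 2.5 ms frames (RFC 6716, Table 2). */
    enum { CELT_NB_2_5_MS = 16 };

    if (units < 1 || units > 48)
        return 0;

    if (units == 1) {
        /* TOC = config(5 bits) | stereo(1 bit) | code(2 bits).
         * Code 0 means one frame; with no bytes after the TOC,
         * that frame has length zero. */
        out[0] = (uint8_t)(CELT_NB_2_5_MS << 3);
        return 1;
    }

    /* Code 3: the TOC is followed by a frame count byte laid out as
     * v (VBR flag, bit 7), p (padding flag, bit 6), M (frame count, bits 0..5).
     * With v = 0, p = 0 and nothing after this byte, all M frames are empty. */
    out[0] = (uint8_t)((CELT_NB_2_5_MS << 3) | 0x03);
    out[1] = (uint8_t)units;
    return 2;
}
```

For example, sketch_empty_packet(48, buf) fills buf with 0x83 0x30, which
describes 48 empty 2.5 ms frames, i.e. 120 ms.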

	Jean-Marc

From markh.sj@gmail.com  Thu Aug 29 16:21:46 2013
Return-Path: <markh.sj@gmail.com>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 0DBC911E818B for <codec@ietfa.amsl.com>; Thu, 29 Aug 2013 16:21:46 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.6
X-Spam-Level: 
X-Spam-Status: No, score=-2.6 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, NO_RELAYS=-0.001]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cq0ROR4s+c67 for <codec@ietfa.amsl.com>; Thu, 29 Aug 2013 16:21:45 -0700 (PDT)
Received: from mail-qa0-x233.google.com (mail-qa0-x233.google.com [IPv6:2607:f8b0:400d:c00::233]) by ietfa.amsl.com (Postfix) with ESMTP id 748B311E817D for <codec@ietf.org>; Thu, 29 Aug 2013 16:21:45 -0700 (PDT)
Received: by mail-qa0-f51.google.com with SMTP id bv4so800040qab.17 for <codec@ietf.org>; Thu, 29 Aug 2013 16:21:45 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:message-id:subject:from:to:content-type; bh=M6IZH78xI41PDKvvJzaeQzIP9yQ/cn3YCG4Zk6sFrqo=; b=y4ePDbKJ1WdqYoZ8cOLTeTDFjpN3zxn4PVC4y1b7dh/qzxNaDRX68MdzpbBi96u1FP U7IeXNoHVfryGbZsKng1sQpfsNwfTKHpOfEcxc0E9/yEoGP1cV6L/lxarksf0fGB/SEU mvCkKlAi7YawqlZg/aT1QbH3CXH9eWrExcmneUA4+sOxrERFenm0/NdflyNAJM3/Npfj 5fqtEHxfJgWE1DQ7YnTp1nvpMcg0Td2dEPxVvgaXjQj+CHgDm1vBubglcIctCTtzXFz+ 47rZM8IeIJu9jmHWybJDbrldMD9UNGRxBe/o3IQZZwVxeR6LWCCCFdH0/RPSlJc6dIjg rf/g==
MIME-Version: 1.0
X-Received: by 10.224.12.146 with SMTP id x18mr51051qax.110.1377818504977; Thu, 29 Aug 2013 16:21:44 -0700 (PDT)
Sender: markh.sj@gmail.com
Received: by 10.49.101.40 with HTTP; Thu, 29 Aug 2013 16:21:44 -0700 (PDT)
Date: Thu, 29 Aug 2013 16:21:44 -0700
X-Google-Sender-Auth: Mrp48fABhgWz9pehiCqhUBhfJJ8
Message-ID: <CAMdZqKE6Z3NWN=igExUNZ8SaQ=xeR468-Mdxj1DObD9DhaztbQ@mail.gmail.com>
From: Mark Harris <mark.hsj@gmail.com>
To: codec@ietf.org
Content-Type: text/plain; charset=UTF-8
Subject: [codec] draft-valin-codec-opus-update-00 minor nits
X-BeenThere: codec@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Codec WG <codec.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/codec>, <mailto:codec-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/codec>
List-Post: <mailto:codec@ietf.org>
List-Help: <mailto:codec-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/codec>, <mailto:codec-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 29 Aug 2013 23:21:46 -0000

Some minor formatting/grammar nits in draft-valin-codec-opus-update-00:


   This document address minor issues that were discovered in the

s/address/addresses/

   impulse (i.e.  single sample) in the decoded audio.  This can be

s/i.e.  /i.e. /

   This packet parsing issue is limited to reading memory up to about 60
   kB beyond the compressed buffer.  This can only be triggered by a

Incorrect line break between a quantity and its unit ("60" and "kB").

   for RTP.  In theory, it _could_ crash a file decoder (e.g.  Opus in

s/e.g.  /e.g. /

   2.  Because the size was wrong, this potentially allowed the source
       and destination regions of the memcpy overlap.  We _believe_ that
       nSamplesIn is at least fs_in_khZ, which is at least 8.  Since
       RESAMPLER_ORDER_FIR_12 is only 8,that should not be a problem
       once the type size is fixed.

s/memcpy overlap/memcpy() to overlap/
s/8,that/8, that/

   The fact that the code never produced any error in testing (including
   when run under the Valgrind memory debugger), suggests that in
   practice the batch sizes are reasonable enough that none of the
   issues above was ever a problem.  However, proving that is non-
   obvious.

s/was/were/

   The code can be fixed by applying the following changes to like 70 of
   silk/resampler_private_IIR_FIR.c:

s/like/line/

   The last issue is not strictly a bug, but it is an issue that has
   been reported when downmixing Opus decoded stream to mono, whether
   this is done inside the decoder or as a post-processing on the stereo
   decoder output.  Opus intensity stereo allows optionally coding the

s/downmixing/downmixing an/
s/post-processing/post-processing step/

Also, some source lines exceed 72 columns (the RFC maximum).
