
From nobody Mon May  4 19:45:28 2015
Return-Path: <markh.sj@gmail.com>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 540101B2DA6 for <codec@ietfa.amsl.com>; Mon,  4 May 2015 19:45:25 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -0.101
X-Spam-Level: 
X-Spam-Status: No, score=-0.101 tagged_above=-999 required=5 tests=[BAYES_40=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, SPF_PASS=-0.001] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id kXoidpYQndR9 for <codec@ietfa.amsl.com>; Mon,  4 May 2015 19:45:19 -0700 (PDT)
Received: from mail-ig0-x231.google.com (mail-ig0-x231.google.com [IPv6:2607:f8b0:4001:c05::231]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 21EF21B2D9C for <codec@ietf.org>; Mon,  4 May 2015 19:45:19 -0700 (PDT)
Received: by igblo3 with SMTP id lo3so98265190igb.0 for <codec@ietf.org>; Mon, 04 May 2015 19:45:18 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;  h=mime-version:sender:date:message-id:subject:from:to:content-type;  bh=G7AHsnFCnnuxNboYY3b5tZ+f27moRWwc++2A4pH1Eto=; b=JiOximhr809RRlKq8LlRLHIzDH7UeAC7WI+6gmhJP//KHEdnVVjYDJQDjfbikHnF8e hlWKx1tiapZmQ5CNxi8Jc6ExDYwVBr+mMMUgKnhFJuBy1ZlE0q0CCArx1oeSdLdbIdH0 TpYi3fH4wQMbMOjlegsCNRmgnZPAQpdzbRxDADzdshFLEGDAJmODq1WeN3PuwmxfOgVq IdI/k41UCXo0fJHbLKxJQ+Cc5ggQv3zzh7o4crkltzKyz2cW6d/ydW3zPwxsiVVIlXo+ lrdT7fsphBIplz2GYGZdoC9GatYJkCpeZvoVdCa71dZ9lLUuxIpimds9E8yXXDNpA2QW lLHw==
MIME-Version: 1.0
X-Received: by 10.50.132.71 with SMTP id os7mr840327igb.24.1430793918588; Mon, 04 May 2015 19:45:18 -0700 (PDT)
Sender: markh.sj@gmail.com
Received: by 10.107.19.130 with HTTP; Mon, 4 May 2015 19:45:18 -0700 (PDT)
Date: Mon, 4 May 2015 19:45:18 -0700
X-Google-Sender-Auth: Qv7JTxteJn2phQKT4QEgfGxnvy0
Message-ID: <CAMdZqKFo6wUiVizeQFWXjr80-27=r7H1MN+07sX0jm3Le=sLng@mail.gmail.com>
From: Mark Harris <mark.hsj@gmail.com>
To: "codec@ietf.org" <codec@ietf.org>
Content-Type: text/plain; charset=UTF-8
Archived-At: <http://mailarchive.ietf.org/arch/msg/codec/y5P4P7JtDlvxULsMQwux8roLZDw>
Subject: [codec] draft-ietf-codec-oggopus-07 packet sizes
X-BeenThere: codec@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Codec WG <codec.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/codec>, <mailto:codec-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/codec/>
List-Post: <mailto:codec@ietf.org>
List-Help: <mailto:codec-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/codec>, <mailto:codec-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 05 May 2015 02:45:25 -0000

draft-ietf-codec-oggopus-07 adds some concrete limits to the packet
sizes that must be accepted, which I agree is a good idea; however,
there are some issues with the specific text.

Section 6 states:
"Decoders SHOULD reject packets larger than 60 kB per channel, and
display a warning message, and MAY reject packets larger than 7.5 kB
per channel."

It is unclear whether "channel" here refers to output channels,
decoded channels, or physical channels (although output channels makes
the most sense).  Also, while the next paragraph appears to be a
justification for these numbers, the numbers in that paragraph are per
stream and not per channel.  The number of output channels may be
higher or lower than the number of streams so the connection is
unclear.  Additionally, it is unclear whether this section applies only
to audio data packets or also to comment packets, which do not have
channels or streams but could be larger than audio data packets.

It is also unclear how this interacts with the Section 5.1.1.5
requirement that an Ogg Opus player play *any* Ogg Opus stream with a
channel mapping family of 0 or 1 (apparently including those with
large packets).  In order to comply with that, does that mean that
large packets may only be rejected when the channel mapping family is
not 0 or 1, or by applications that are not an "Ogg Opus player"?

It appears that the intent is to require Ogg Opus players to play any
valid Ogg Opus stream with channel mapping family 0 or 1 (maximum 8
channels) whose packets stay within limits that should be reasonable
for any such stream.  That allows resource-constrained implementations
to use fixed buffers, and gives a reasonably reliable test that a
particular stream is likely to play as intended even on
resource-constrained players.  Streams limited only by what should be
considered malicious would still be allowed, but might not be playable
on all players.

To address these issues, I propose that the packet size required to be
processed by any implementation be stated as a simple 60 kB size and
be connected to the Section 5.1.1.5 requirement.  This is the size
suggested at the end of Section 6; it simplifies the requirement
without increasing the packet size that a minimal player could
already be interpreted as being required to play.  I further
suggest that the larger size at which audio data packets SHOULD be
rejected be set at 60 kB per stream rather than per "channel",
aligning with the largest possible valid packet without padding as
described at the beginning of the second paragraph of Section 6.
Because packets exceeding 60 kB per stream necessarily use padding in
excess of what would be required to make a VBR stream CBR, this also
aligns with Section 6 encoder recommendations and avoids the situation
where decoders SHOULD reject certain packets that an encoder not
violating any SHOULD or MUST might produce.

The new text about decoders displaying a warning message is also
worrying.  Typically a decoder does not have access to the
application's preferred display or localization facilities so this
seems inappropriate.  Even considering the higher layers it seems
unlikely that typical usage, such as background audio decoding in a
portable music player, game engine, or the like would have a facility
for displaying this kind of warning message during decoding.  Also if
a warning should be displayed when packets larger than 60 kB per
channel are rejected, why not when smaller packets are rejected?  For
these reasons, and because applications are already free to indicate
any issues that are important to their users in the manner most
appropriate for their application, I suggest omitting this text.

On another note, it appears that in the packet sizes in Section 6, kB
refers to 1024 bytes and MB refers to 1024*1024 bytes, whereas
everywhere else in the document (including kbps and Mbps in the same
paragraph) the standard SI decimal multiples are intended.  Because
the usage is limited I suggest just sticking with bytes/octets to
avoid confusion, rather than adding a sentence to clarify the mixed
usage.  If a 1024-byte unit is desired I believe that the traditional
designation is KB (or the newer KiB).
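
As a quick illustration of the mixed units, the following sketch
converts the Section 6 figures both ways.  The constant and function
names are made up for this example and do not come from the draft.

```python
# Illustration only: how the Section 6 sizes read under binary (1024)
# versus decimal (1000) multiples.  All names here are hypothetical.
KB_BINARY = 1024
KB_DECIMAL = 1000
MB_BINARY = 1024 * 1024
MB_DECIMAL = 1000 * 1000


def to_octets(value, unit):
    """Convert a size expressed in `unit` multiples to whole octets."""
    return int(value * unit)


# "60 kB" and "7.5 kB" as apparently intended (binary multiples).
limit_60_binary = to_octets(60, KB_BINARY)
limit_7_5_binary = to_octets(7.5, KB_BINARY)

# The same labels read as SI decimal multiples.
limit_60_decimal = to_octets(60, KB_DECIMAL)
limit_7_5_decimal = to_octets(7.5, KB_DECIMAL)

# The largest unpadded packet with 255 streams, quoted as "14.9 MB".
largest_octets = 15_630_988
largest_mb_binary = largest_octets / MB_BINARY
largest_mb_decimal = largest_octets / MB_DECIMAL

if __name__ == "__main__":
    print(limit_60_binary, limit_60_decimal)
    print(limit_7_5_binary, limit_7_5_decimal)
    print(round(largest_mb_binary, 1), round(largest_mb_decimal, 1))
```

The "14.9 MB" figure only matches the octet count under the binary
reading, which is why plain octet counts seem less error-prone.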

Below is a precise proposal for comment:

In Section 5.1.1.5 replace:

   An Ogg Opus player MUST play any Ogg Opus stream with a channel
   mapping family of 0 or 1, even if the number of channels does not
   match the physically connected audio hardware.

with:

   An Ogg Opus player MUST play any valid Ogg Opus stream with a
   channel mapping family of 0 or 1 that contains no packet larger
   than 61,440 octets, even if the number of channels does not match
   the physically connected audio hardware.

In Section 6 replace:

   Technically, valid Opus packets can be arbitrarily large due to the
   padding format, although the amount of non-padding data they can
   contain is bounded.  These packets might be spread over a similarly
   enormous number of Ogg pages.  Encoders SHOULD use no more padding
   than is necessary to make a variable bitrate (VBR) stream constant
   bitrate (CBR).  Decoders SHOULD avoid attempting to allocate
   excessive amounts of memory when presented with a very large packet.
   Decoders SHOULD reject packets larger than 60 kB per channel, and
   display a warning message, and MAY reject packets larger than 7.5 kB
   per channel.  The presence of an extremely large packet in the stream
   could indicate a memory exhaustion attack or stream corruption.

with:

   Technically, valid Opus packets can be arbitrarily large due to
   padding, although the amount of non-padding data that an audio data
   packet can contain is bounded.  These packets might be spread over
   a similarly enormous number of Ogg pages.  In an audio data packet,
   encoders SHOULD use no more padding than is necessary to make a
   variable bitrate (VBR) stream constant bitrate (CBR).  Decoders
   SHOULD reject audio data packets larger than 61,440 octets per
   stream; such packets necessarily contain more padding than needed
   for this purpose.  Decoders SHOULD avoid attempting to allocate
   excessive amounts of memory when presented with a very large
   packet, and MAY reject or partially process any packet larger
   than 61,440 octets.  The presence of an extremely large packet
   in the stream could indicate a memory exhaustion attack or stream
   corruption.

Replace:

   In an Ogg Opus stream, the largest possible valid packet that does
   not use padding has a size of (61,298*N - 2) octets, or about 60 kB
   per Opus stream.  With 255 streams, this is 15,630,988 octets
   (14.9 MB) and can span up to 61,298 Ogg pages, all but one of which
   will have a granule position of -1.  This is of course a very extreme
   packet, consisting of 255 streams, each containing 120 ms of audio
   encoded as 2.5 ms frames, each frame using the maximum possible
   number of octets (1275) and stored in the least efficient manner
   allowed (a VBR code 3 Opus packet).  Even in such a packet, most of
   the data will be zeros as 2.5 ms frames cannot actually use all
   1275 octets.  The largest packet consisting of entirely useful data
   is (15,326*N - 2) octets, or about 15 kB per stream.  This
   corresponds to 120 ms of audio encoded as 10 ms frames in either SILK
   or Hybrid mode, but at a data rate of over 1 Mbps, which makes little
   sense for the quality achieved.  A more reasonable limit is
   (7,664*N - 2) octets, or about 7.5 kB per stream.  This corresponds
   to 120 ms of audio encoded as 20 ms stereo CELT mode frames, with a
   total bitrate just under 511 kbps (not counting the Ogg encapsulation
   overhead).  With N=8, the maximum number of channels currently
   defined by mapping family 1, this gives a maximum packet size of
   61,310 octets, or just under 60 kB.  This is still quite
   conservative, as it assumes each output channel is taken from one
   decoded channel of a stereo packet.  An implementation could
   reasonably choose any of these numbers for its internal limits.

with:

   In an Ogg Opus stream, the largest possible valid audio data packet
   that does not use padding has a size of (61,298*N - 2) octets,
   although a valid comment header packet may be larger.  With 255 Opus
   streams, this is 15,630,988 octets and can span up to 61,298 Ogg
   pages, all but one of which will have a granule position of -1.
   This is of course a very extreme packet, consisting of 255 Opus
   streams, each containing 120 ms of audio encoded as 2.5 ms frames,
   each frame using the maximum possible number of octets (1275) and
   stored in the least efficient manner allowed (a VBR code 3 Opus
   packet).  Even in such a packet, most of the data will be zeros as
   2.5 ms frames cannot actually use all 1275 octets.

   The largest audio data packet consisting of entirely useful data is
   (15,326*N - 2) octets.  This corresponds to 120 ms of audio encoded
   as 10 ms frames in either SILK or Hybrid mode, but at a data rate
   of over 1 Mbps, which makes little sense for the quality achieved.

   A more reasonable limit is (7,664*N - 2) octets.  This corresponds
   to 120 ms of audio encoded as 20 ms stereo CELT mode frames, with
   a total bitrate just under 511 kbps (not counting the Ogg
   encapsulation overhead).  For mapping family 1, N=8 provides a
   reasonable upper bound, as it allows for each of the 8 possible
   output channels to be encoded in a separate stereo Opus stream.
   This gives a packet size of 61,310 octets, which is rounded up to a
   multiple of 1024 octets to give the packet size of 61,440 octets
   that is expected to be successfully processed by any implementation.
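
The figures in the proposed text can be reproduced with a short
calculation.  This is a sketch under stated assumptions: the overhead
breakdown (a TOC octet, a frame count octet, two-octet frame lengths,
and two extra octets for self-delimiting framing on every stream but
the last) is one reading of the Opus packet layout and is not spelled
out in the text above, and the function names are hypothetical.

```python
# Sketch: reproduce the per-stream and per-packet sizes quoted above.
# The overhead breakdown is an assumption, not text from the draft.
MAX_FRAME_OCTETS = 1275   # largest Opus frame
PACKET_MS = 120           # longest audio duration in one Opus packet


def per_stream_octets(frame_ms):
    """Largest unpadded code 3 VBR packet, self-delimiting framing."""
    frames = round(PACKET_MS / frame_ms)
    payload = frames * MAX_FRAME_OCTETS
    header = 2                    # TOC octet + frame count octet
    lengths = 2 * (frames - 1)    # 2-octet length for all but last frame
    self_delimiting = 2           # explicit length of the last frame
    return payload + header + lengths + self_delimiting


def max_packet_octets(frame_ms, streams):
    """N streams; the last one does not need self-delimiting framing."""
    return per_stream_octets(frame_ms) * streams - 2


def round_up(value, multiple=1024):
    """Round up to the next multiple (integer arithmetic only)."""
    return -(-value // multiple) * multiple


def stream_kbps(frame_ms):
    """Per-stream bitrate in kbit/s, ignoring Ogg overhead."""
    return per_stream_octets(frame_ms) * 8 / PACKET_MS


if __name__ == "__main__":
    for frame_ms in (2.5, 10, 20):
        print(frame_ms, per_stream_octets(frame_ms), stream_kbps(frame_ms))
    print(max_packet_octets(2.5, 255))
    print(max_packet_octets(20, 8), round_up(max_packet_octets(20, 8)))
```

Under these assumptions the per-stream values come out as 61,298,
15,326, and 7,664 octets, matching the coefficients in the text.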

I also found that the fragment identifiers are no longer valid in the
informative references to the Vorbis specification.  This fixes those
and makes the host name consistent.

In Section 14.2 replace:

   https://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9

with:

   https://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-810004.3.9

Replace:

   https://xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2

with:

   https://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-132000A.2

 - Mark


From nobody Fri May  8 11:44:37 2015
Return-Path: <ietf-secretariat-reply@ietf.org>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C000B1A89AF for <codec@ietfa.amsl.com>; Tue, 28 Apr 2015 14:35:21 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.9
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 tagged_above=-999 required=5 tests=[BAYES_00=-1.9] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id amzQfZYE0Dyi for <codec@ietfa.amsl.com>; Tue, 28 Apr 2015 14:35:20 -0700 (PDT)
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id A00E21A89F1 for <codec@ietf.org>; Tue, 28 Apr 2015 14:35:19 -0700 (PDT)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
To: <codec@ietf.org>
X-Test-IDTracker: no
X-IETF-IDTracker: 6.0.2
Auto-Submitted: auto-generated
Precedence: bulk
Message-ID: <20150428213519.16786.857.idtracker@ietfa.amsl.com>
Date: Tue, 28 Apr 2015 14:35:19 -0700
From: IETF Secretariat <ietf-secretariat-reply@ietf.org>
Archived-At: <http://mailarchive.ietf.org/arch/msg/codec/kENAvXJZnEA94B0CYd3J0_wR-7M>
X-Mailman-Approved-At: Fri, 08 May 2015 11:44:35 -0700
Subject: [codec] Milestones changed for codec WG
X-BeenThere: codec@ietf.org
X-Mailman-Version: 2.1.15
X-List-Received-Date: Tue, 28 Apr 2015 21:35:21 -0000

Changed milestone "Codec Standardization Guidelines to IESG
(Informational)", resolved as "Done".

Changed milestone "WGLC #2 on Codec specification", resolved as
"Done".

Changed milestone "Submit codec specification to IESG (Standards
Track)", resolved as "Done".

URL: https://datatracker.ietf.org/wg/codec/charter/


From nobody Fri May  8 11:44:38 2015
Return-Path: <ietf-secretariat-reply@ietf.org>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1B5421A8998 for <codec@ietfa.amsl.com>; Tue, 28 Apr 2015 14:36:09 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.9
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 tagged_above=-999 required=5 tests=[BAYES_00=-1.9] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ZeU4i3-qnFuT for <codec@ietfa.amsl.com>; Tue, 28 Apr 2015 14:36:08 -0700 (PDT)
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 63FDC1A89F5 for <codec@ietf.org>; Tue, 28 Apr 2015 14:36:07 -0700 (PDT)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
To: <codec@ietf.org>
X-Test-IDTracker: no
X-IETF-IDTracker: 6.0.2
Auto-Submitted: auto-generated
Precedence: bulk
Message-ID: <20150428213607.16346.21168.idtracker@ietfa.amsl.com>
Date: Tue, 28 Apr 2015 14:36:07 -0700
From: IETF Secretariat <ietf-secretariat-reply@ietf.org>
Archived-At: <http://mailarchive.ietf.org/arch/msg/codec/SjugTzz9-AwRcPGT_ZFRHaucctQ>
X-Mailman-Approved-At: Fri, 08 May 2015 11:44:35 -0700
Subject: [codec] Milestones changed for codec WG
X-BeenThere: codec@ietf.org
X-Mailman-Version: 2.1.15
X-List-Received-Date: Tue, 28 Apr 2015 21:36:09 -0000

Changed milestone "Error and bugfix update(s) to RFC 6716", set state
to active from review, accepting new milestone.

URL: https://datatracker.ietf.org/wg/codec/charter/


From nobody Tue May 12 15:49:15 2015
Return-Path: <tterribe@xiph.org>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 08D751B29A4 for <codec@ietfa.amsl.com>; Tue, 12 May 2015 15:49:14 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 1.723
X-Spam-Level: *
X-Spam-Status: No, score=1.723 tagged_above=-999 required=5 tests=[BAYES_50=0.8, HELO_MISMATCH_ORG=0.611, HOST_MISMATCH_COM=0.311, SPF_FAIL=0.001] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id PIviTzYdiKD5 for <codec@ietfa.amsl.com>; Tue, 12 May 2015 15:49:10 -0700 (PDT)
Received: from smtp.mozilla.org (mx1.scl3.mozilla.com [63.245.214.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 25C441B2850 for <codec@ietf.org>; Tue, 12 May 2015 15:49:10 -0700 (PDT)
Received: from localhost (localhost6.localdomain [127.0.0.1]) by mx1.mail.scl3.mozilla.com (Postfix) with ESMTP id 961D9C1867 for <codec@ietf.org>; Tue, 12 May 2015 22:49:09 +0000 (UTC)
X-Virus-Scanned: amavisd-new at mozilla.org
Received: from smtp.mozilla.org ([127.0.0.1]) by localhost (mx1.mail.scl3.mozilla.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ivtitIZJYcIs for <codec@ietf.org>; Tue, 12 May 2015 22:49:09 +0000 (UTC)
Received: from [10.252.28.249] (corp.mtv2.mozilla.com [63.245.221.32]) (Authenticated sender: tterriberry@mozilla.com) by mx1.mail.scl3.mozilla.com (Postfix) with ESMTPSA id 7CC22C144D for <codec@ietf.org>; Tue, 12 May 2015 22:49:09 +0000 (UTC)
Message-ID: <55528365.3090100@xiph.org>
Date: Tue, 12 May 2015 15:49:09 -0700
From: "Timothy B. Terriberry" <tterribe@xiph.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:29.0) Gecko/20100101 SeaMonkey/2.26
MIME-Version: 1.0
To: "codec@ietf.org" <codec@ietf.org>
References: <CAMdZqKFo6wUiVizeQFWXjr80-27=r7H1MN+07sX0jm3Le=sLng@mail.gmail.com>
In-Reply-To: <CAMdZqKFo6wUiVizeQFWXjr80-27=r7H1MN+07sX0jm3Le=sLng@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Archived-At: <http://mailarchive.ietf.org/arch/msg/codec/IXqVlHKaHQoKmL9lSACGNTZ8dmw>
Subject: Re: [codec] draft-ietf-codec-oggopus-07 packet sizes
X-BeenThere: codec@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
X-List-Received-Date: Tue, 12 May 2015 22:49:14 -0000

Mark Harris wrote:
>     An Ogg Opus player MUST play any valid Ogg Opus stream with a
>     channel mapping family of 0 or 1 that contains no packet larger
>     than 61,440 octets, even if the number of channels does not match
>     the physically connected audio hardware.

This should probably forward-reference Section 6, i.e., simply by adding 
a "(See Section 6)" after "61,440 octets." Also, I assume you meant this 
restriction to only apply to audio data packets (in keeping with your 
new text in Section 6)?

> In Section 6 replace:
>
>     Technically, valid Opus packets can be arbitrarily large due to the
>     padding format, although the amount of non-padding data they can
>     contain is bounded.  These packets might be spread over a similarly
>     enormous number of Ogg pages.  Encoders SHOULD use no more padding
>     than is necessary to make a variable bitrate (VBR) stream constant
>     bitrate (CBR).  Decoders SHOULD avoid attempting to allocate
>     excessive amounts of memory when presented with a very large packet.
>     Decoders SHOULD reject packets larger than 60 kB per channel, and
>     display a warning message, and MAY reject packets larger than 7.5 kB
>     per channel.  The presence of an extremely large packet in the stream
>     could indicate a memory exhaustion attack or stream corruption.
>
> with:
>
>     Technically, valid Opus packets can be arbitrarily large due to
>     padding, although the amount of non-padding data that an audio data
>     packet can contain is bounded.  These packets might be spread over
>     a similarly enormous number of Ogg pages.  In an audio data packet,
>     encoders SHOULD use no more padding than is necessary to make a
>     variable bitrate (VBR) stream constant bitrate (CBR).  Decoders

Talking about doing something in a single audio data packet that affects 
an entire stream seems a little tortured. How about, "Encoders SHOULD 
pad audio data packets no more than necessary to make a variable bitrate 
(VBR) stream constant bitrate (CBR)"?

>     SHOULD reject audio data packets larger than 61,440 octets per
>     stream; such packets necessarily contain more padding than needed
>     for this purpose.  Decoders SHOULD avoid attempting to allocate
>     excessive amounts of memory when presented with a very large
>     packet, and MAY reject or partially process any packet larger
>     than 61,440 octets.  The presence of an extremely large packet

This limit is reasonable for channel mapping families 0 and 1 (7.5 kB 
per stream for each of a maximum of 8 non-useless streams), but not for 
channel mapping family 255, which may carry many more channels (and is 
currently the only way to do so). Suggest, "...and MAY reject or 
partially process any packet larger than 61,440 octets when used with 
channel mapping family 1, or a packet larger than 7,680 octets per 
stream when used with channel mapping family 255."

We could also put in a smaller MAY limit for channel mapping family 0, 
but I don't think that's that useful, as I suspect most players will a) 
only support mapping families 0 and 1, and b) use a fixed-size buffer 
independent of the file they're decoding.
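
To make the suggested wording concrete, here is a minimal sketch of
how a decoder might pick its MAY-reject threshold.  The names are
hypothetical, and treating mapping family 0 the same as family 1 is an
assumption on my part (the suggested sentence only names families 1
and 255).

```python
# Sketch of the suggested MAY-reject thresholds.  Names are
# hypothetical; family 0 sharing the family 1 limit is an assumption.
FIXED_LIMIT_OCTETS = 61_440       # families 0 and 1
PER_STREAM_LIMIT_OCTETS = 7_680   # family 255, per Opus stream


def may_reject_limit(mapping_family, stream_count):
    """Return the packet size above which rejection is permitted."""
    if mapping_family in (0, 1):
        return FIXED_LIMIT_OCTETS
    if mapping_family == 255:
        return PER_STREAM_LIMIT_OCTETS * stream_count
    raise ValueError("unsupported channel mapping family")


def may_reject(packet_len, mapping_family, stream_count):
    """True if the packet is larger than the applicable limit."""
    return packet_len > may_reject_limit(mapping_family, stream_count)


if __name__ == "__main__":
    print(may_reject_limit(1, 8))
    print(may_reject_limit(255, 255))
    print(may_reject(61_441, 1, 8))
```

A player with a single fixed-size buffer would only ever need the
first branch.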

>     in the stream could indicate a memory exhaustion attack or stream
>     corruption.
>
> Replace:
>
>     In an Ogg Opus stream, the largest possible valid packet that does
>     not use padding has a size of (61,298*N - 2) octets, or about 60 kB
>     per Opus stream.  With 255 streams, this is 15,630,988 octets
>     (14.9 MB) and can span up to 61,298 Ogg pages, all but one of which
>     will have a granule position of -1.  This is of course a very extreme
>     packet, consisting of 255 streams, each containing 120 ms of audio
>     encoded as 2.5 ms frames, each frame using the maximum possible
>     number of octets (1275) and stored in the least efficient manner
>     allowed (a VBR code 3 Opus packet).  Even in such a packet, most of
>     the data will be zeros as 2.5 ms frames cannot actually use all
>     1275 octets.  The largest packet consisting of entirely useful data
>     is (15,326*N - 2) octets, or about 15 kB per stream.  This
>     corresponds to 120 ms of audio encoded as 10 ms frames in either SILK
>     or Hybrid mode, but at a data rate of over 1 Mbps, which makes little
>     sense for the quality achieved.  A more reasonable limit is
>     (7,664*N - 2) octets, or about 7.5 kB per stream.  This corresponds
>     to 120 ms of audio encoded as 20 ms stereo CELT mode frames, with a
>     total bitrate just under 511 kbps (not counting the Ogg encapsulation
>     overhead).  With N=8, the maximum number of channels currently
>     defined by mapping family 1, this gives a maximum packet size of
>     61,310 octets, or just under 60 kB.  This is still quite
>     conservative, as it assumes each output channel is taken from one
>     decoded channel of a stereo packet.  An implementation could
>     reasonably choose any of these numbers for its internal limits.
>
> with:
>
>     In an Ogg Opus stream, the largest possible valid audio data packet
>     that does not use padding has a size of (61,298*N - 2) octets,
>     although a valid comment header packet may be larger.  With 255 Opus
>     streams, this is 15,630,988 octets and can span up to 61,298 Ogg
>     pages, all but one of which will have a granule position of -1.
>     This is of course a very extreme packet, consisting of 255 Opus
>     streams, each containing 120 ms of audio encoded as 2.5 ms frames,
>     each frame using the maximum possible number of octets (1275) and
>     stored in the least efficient manner allowed (a VBR code 3 Opus
>     packet).  Even in such a packet, most of the data will be zeros as
>     2.5 ms frames cannot actually use all 1275 octets.
>
>     The largest audio data packet consisting of entirely useful data is
>     (15,326*N - 2) octets.  This corresponds to 120 ms of audio encoded
>     as 10 ms frames in either SILK or Hybrid mode, but at a data rate
>     of over 1 Mbps, which makes little sense for the quality achieved.
>
>     A more reasonable limit is (7,664*N - 2) octets.  This corresponds
>     to 120 ms of audio encoded as 20 ms stereo CELT mode frames, with
>     a total bitrate just under 511 kbps (not counting the Ogg
>     encapsulation overhead).  For mapping family 1, N=8 provides a
>     reasonable upper bound, as it allows for each of the 8 possible
>     output channels to be encoded in a separate stereo Opus stream.

Suggest "decoded from" instead of "encoded in" to make this sound more 
reasonable.

>     This gives a packet size of 61,310 octets, which is rounded up to a
>     multiple of 1024 octets to give the packet size of 61,440 octets
>     that is expected to be successfully processed by any implementation.

Just wordsmithing, how about: "This gives a limit of 61,310 octets, 
rounded up to a multiple of 1024 to yield the maximum audio data packet 
size of 61,440 octets that any implementation is expected to be able to 
process successfully."

> I also found that the fragment identifiers are no longer valid in the
> informative references to the Vorbis specification.  This fixes those
> and makes the host name consistent.
>
> In Section 14.2 replace:
>
>     https://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9
>
> with:
>
>     https://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-810004.3.9
>
> Replace:
>
>     https://xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2
>
> with:
>
>     https://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-132000A.2

Good catch, and thanks for all the review. If what I wrote above sounds 
fine, and there are no further comments/objections, I will make the 
changes by the end of the week.


From nobody Wed May 13 00:44:52 2015
Return-Path: <markh.sj@gmail.com>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 01D951A1A6D for <codec@ietfa.amsl.com>; Wed, 13 May 2015 00:44:51 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 0.7
X-Spam-Level: 
X-Spam-Status: No, score=0.7 tagged_above=-999 required=5 tests=[BAYES_50=0.8,  DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, SPF_PASS=-0.001] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id io5mQ1WFwsH3 for <codec@ietfa.amsl.com>; Wed, 13 May 2015 00:44:48 -0700 (PDT)
Received: from mail-ig0-x22b.google.com (mail-ig0-x22b.google.com [IPv6:2607:f8b0:4001:c05::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 71B521A1A6B for <codec@ietf.org>; Wed, 13 May 2015 00:44:48 -0700 (PDT)
Received: by igbhj9 with SMTP id hj9so37590407igb.1 for <codec@ietf.org>; Wed, 13 May 2015 00:44:48 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;  h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:content-type; bh=fSrSAeL6f25KK+mpG1Ii+8wWrWjeOKrhjMguJFHKkqw=; b=c+ojodMo6w5000TGB5Nd46nKXNeAph4dY6KxB0OoYfh/3WR5SxU2tV1UuGs8ahUP/N SpNSyzf0Y4mRBj1v1kbalAhkRej5ifXwbmC2aVpybNk2PFpYEXJD2e8u5yWFu75tj+5V h5bAvAwSGsWwsqWdCfFtCKT9abIpQiZZoNrjE4mZOrbuP5t6wksRqPhVrTn8PkX2IuCz CKyZLPHHZfM5MJSQ7xRYW5To2cNbPykVgUO0C1fMUE/5qmxTOuTy2o3FsfdG8JAiUQNy NMydT4nkeiWT+wuDk7b7CbExaZ/pj1fRr0h5h9PIlQm+6b042bI25TBn7RMzgexE+1dI yZ8w==
MIME-Version: 1.0
X-Received: by 10.50.109.138 with SMTP id hs10mr9081517igb.48.1431503087941; Wed, 13 May 2015 00:44:47 -0700 (PDT)
Sender: markh.sj@gmail.com
Received: by 10.107.19.130 with HTTP; Wed, 13 May 2015 00:44:47 -0700 (PDT)
In-Reply-To: <55528365.3090100@xiph.org>
References: <CAMdZqKFo6wUiVizeQFWXjr80-27=r7H1MN+07sX0jm3Le=sLng@mail.gmail.com> <55528365.3090100@xiph.org>
Date: Wed, 13 May 2015 00:44:47 -0700
X-Google-Sender-Auth: K1GdOPcavKEjIJiCWRiHd0LP3w0
Message-ID: <CAMdZqKF6xb13nuF+hro6gyKaAfKXoiUjGbaCddHOQqkVvB7mHQ@mail.gmail.com>
From: Mark Harris <mark.hsj@gmail.com>
To: "codec@ietf.org" <codec@ietf.org>
Content-Type: text/plain; charset=UTF-8
Archived-At: <http://mailarchive.ietf.org/arch/msg/codec/srAveI9Wb3pKD9NpfBVfkf_ZGf8>
Subject: Re: [codec] draft-ietf-codec-oggopus-07 packet sizes
X-BeenThere: codec@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
X-List-Received-Date: Wed, 13 May 2015 07:44:51 -0000

Timothy B. Terriberry wrote:
> Mark Harris wrote:
>>
>>     An Ogg Opus player MUST play any valid Ogg Opus stream with a
>>     channel mapping family of 0 or 1 that contains no packet larger
>>     than 61,440 octets, even if the number of channels does not match
>>     the physically connected audio hardware.
>
>
> This should probably forward-reference Section 6, i.e., simply by adding a
> "(See Section 6)" after "61,440 octets."

Sounds good.

> Also, I assume you meant this
> restriction to only apply to audio data packets (in keeping with your new
> text in Section 6)?

Actually my proposed text in Section 6 permits rejecting or partial
processing of any packet larger than 61,440 octets (including a
comment packet), although I'm certainly open to other suggestions.  It
was unclear whether the previous text applied to a comment packet, but
if the intent is to allow implementations with a fixed-size packet
buffer then it seems reasonable that the same buffer would be used for
all packets and that the same limit should apply.

>
>> In Section 6 replace:
>>
>>     Technically, valid Opus packets can be arbitrarily large due to the
>>     padding format, although the amount of non-padding data they can
>>     contain is bounded.  These packets might be spread over a similarly
>>     enormous number of Ogg pages.  Encoders SHOULD use no more padding
>>     than is necessary to make a variable bitrate (VBR) stream constant
>>     bitrate (CBR).  Decoders SHOULD avoid attempting to allocate
>>     excessive amounts of memory when presented with a very large packet.
>>     Decoders SHOULD reject packets larger than 60 kB per channel, and
>>     display a warning message, and MAY reject packets larger than 7.5 kB
>>     per channel.  The presence of an extremely large packet in the stream
>>     could indicate a memory exhaustion attack or stream corruption.
>>
>> with:
>>
>>     Technically, valid Opus packets can be arbitrarily large due to
>>     padding, although the amount of non-padding data that an audio data
>>     packet can contain is bounded.  These packets might be spread over
>>     a similarly enormous number of Ogg pages.  In an audio data packet,
>>     encoders SHOULD use no more padding than is necessary to make a
>>     variable bitrate (VBR) stream constant bitrate (CBR).  Decoders
>
>
> Talking about doing something in a single audio data packet that affects an
> entire stream seems a little tortured. How about, "Encoders SHOULD pad audio
> data packets no more than necessary to make a variable bitrate (VBR) stream
> constant bitrate (CBR)"?

I'm not really happy with that wording, which seems as though it could
be misinterpreted as saying that encoders SHOULD pad audio data
packets to make a VBR stream CBR, but not use any more padding than
needed for that.  The intent was to just add a clause to the previous
text to make it clear that it is not restricting the use of other
kinds of padding, such as padding at the end of the comment packet in
a stream that is already CBR.  How about: "Encoders SHOULD limit the
use of padding in audio data packets to no more than is necessary to
make a variable bitrate (VBR) stream constant bitrate (CBR)"?

>
>>     SHOULD reject audio data packets larger than 61,440 octets per
>>     stream; such packets necessarily contain more padding than needed
>>     for this purpose.  Decoders SHOULD avoid attempting to allocate
>>     excessive amounts of memory when presented with a very large
>>     packet, and MAY reject or partially process any packet larger
>>     than 61,440 octets.  The presence of an extremely large packet
>
>
> This limit is reasonable for channel mapping families 0 and 1 (7.5 kB per
> stream for each of a maximum of 8 non-useless streams), but not for channel
> mapping family 255, which may carry many more channels (and is currently the
> only way to do so). Suggest, "...and MAY reject or partially process any
> packet larger than 61,440 octets when used with channel mapping family 1, or
> a packet larger than 7,680 octets per stream when used with channel mapping
> family 255."

Does the 7,680 octets per stream apply to the comment packet?

It is my impression that no decoders are currently required to process
any packets at all from a stream with channel mapping 255.  If that is
not the intention, then perhaps this is what should be clarified.  Is
it the act of starting to process a stream with mapping family 255
that obligates a decoder to process the remaining packets?  That is,
is it your intention that a decoder implementation that can handle
packets up to 1 MB be required to immediately reject and not even
attempt to process a stream with channel mapping 255 and 137 Opus
streams, because it is possible (although unlikely) that it may
contain a packet larger than 1 MB that it would be required to
process?  The text that I proposed was intended to allow such an
implementation to successfully process a stream that does not actually
exceed the implementation's packet size limit.
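
To make the 137-stream figure concrete: assuming "1 MB" here means 2^20
octets, and taking the 7,680 octets per stream from the suggested text
quoted above, 137 is the smallest stream count for which the per-stream
allowance exceeds such a buffer.  A minimal Python sketch of that
arithmetic follows; the constant and function names are illustrative
only and do not come from the draft.

```python
PER_STREAM_LIMIT = 7_680   # octets per Opus stream, from the suggested text
DECODER_BUFFER = 1 << 20   # assumption: "1 MB" read as 2^20 = 1,048,576 octets


def largest_allowed_packet(streams: int) -> int:
    """Largest packet a decoder could not reject under a per-stream limit."""
    return PER_STREAM_LIMIT * streams


def smallest_overflowing_stream_count(buffer_size: int) -> int:
    """Fewest streams for which the per-stream allowance exceeds the buffer."""
    return buffer_size // PER_STREAM_LIMIT + 1


if __name__ == "__main__":
    n = smallest_overflowing_stream_count(DECODER_BUFFER)
    print(n, largest_allowed_packet(n), largest_allowed_packet(n - 1))
```

With 136 streams the allowance is 1,044,480 octets, which still fits;
with 137 it is 1,052,160 octets, which does not.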

>
> We could also put in a smaller MAY limit for channel mapping family 0, but I
> don't think that's that useful, as I suspect most players will a) only
> support mapping families 0 and 1, and b) use a fixed-size buffer independent
> of the file they're decoding.

Also, even in mapping family 0 (or mapping family 255 with 1 stream)
it is reasonable to have a comment packet with album cover art.

>
>
>>     in the stream could indicate a memory exhaustion attack or stream
>>     corruption.
>>
>> Replace:
>>
>>     In an Ogg Opus stream, the largest possible valid packet that does
>>     not use padding has a size of (61,298*N - 2) octets, or about 60 kB
>>     per Opus stream.  With 255 streams, this is 15,630,988 octets
>>     (14.9 MB) and can span up to 61,298 Ogg pages, all but one of which
>>     will have a granule position of -1.  This is of course a very extreme
>>     packet, consisting of 255 streams, each containing 120 ms of audio
>>     encoded as 2.5 ms frames, each frame using the maximum possible
>>     number of octets (1275) and stored in the least efficient manner
>>     allowed (a VBR code 3 Opus packet).  Even in such a packet, most of
>>     the data will be zeros as 2.5 ms frames cannot actually use all
>>     1275 octets.  The largest packet consisting of entirely useful data
>>     is (15,326*N - 2) octets, or about 15 kB per stream.  This
>>     corresponds to 120 ms of audio encoded as 10 ms frames in either SILK
>>     or Hybrid mode, but at a data rate of over 1 Mbps, which makes little
>>     sense for the quality achieved.  A more reasonable limit is
>>     (7,664*N - 2) octets, or about 7.5 kB per stream.  This corresponds
>>     to 120 ms of audio encoded as 20 ms stereo CELT mode frames, with a
>>     total bitrate just under 511 kbps (not counting the Ogg encapsulation
>>     overhead).  With N=8, the maximum number of channels currently
>>     defined by mapping family 1, this gives a maximum packet size of
>>     61,310 octets, or just under 60 kB.  This is still quite
>>     conservative, as it assumes each output channel is taken from one
>>     decoded channel of a stereo packet.  An implementation could
>>     reasonably choose any of these numbers for its internal limits.
>>
>> with:
>>
>>     In an Ogg Opus stream, the largest possible valid audio data packet
>>     that does not use padding has a size of (61,298*N - 2) octets,
>>     although a valid comment header packet may be larger.  With 255 Opus
>>     streams, this is 15,630,988 octets and can span up to 61,298 Ogg
>>     pages, all but one of which will have a granule position of -1.
>>     This is of course a very extreme packet, consisting of 255 Opus
>>     streams, each containing 120 ms of audio encoded as 2.5 ms frames,
>>     each frame using the maximum possible number of octets (1275) and
>>     stored in the least efficient manner allowed (a VBR code 3 Opus
>>     packet).  Even in such a packet, most of the data will be zeros as
>>     2.5 ms frames cannot actually use all 1275 octets.
>>
>>     The largest audio data packet consisting of entirely useful data is
>>     (15,326*N - 2) octets.  This corresponds to 120 ms of audio encoded
>>     as 10 ms frames in either SILK or Hybrid mode, but at a data rate
>>     of over 1 Mbps, which makes little sense for the quality achieved.
>>
>>     A more reasonable limit is (7,664*N - 2) octets.  This corresponds
>>     to 120 ms of audio encoded as 20 ms stereo CELT mode frames, with
>>     a total bitrate just under 511 kbps (not counting the Ogg
>>     encapsulation overhead).  For mapping family 1, N=8 provides a
>>     reasonable upper bound, as it allows for each of the 8 possible
>>     output channels to be encoded in a separate stereo Opus stream.
>
>
> Suggest "decoded from" instead of "encoded in" to make this sound more
> reasonable.

Ok.

>
>>     This gives a packet size of 61,310 octets, which is rounded up to a
>>     multiple of 1024 octets to give the packet size of 61,440 octets
>>     that is expected to be successfully processed by any implementation.
>
>
> Just wordsmithing, how about: "This gives a limit of 61,310 octets, rounded
> up to a multiple of 1024 to yield the maximum audio data packet size of
> 61,440 octets that any implementation is expected to be able to process
> successfully."

This is showing the close connection between a reasonable maximum
packet size for an audio data packet in a stream with mapping family 1
(61,310 octets using the above definition of reasonable) and the
packet size given earlier (for any kind of packet) that any conforming
decoder is expected to be able to process (61,440 octets).  At least
from the point of view of decoder requirements, the 61,440 octet
number is more of a minimum (minimum packet buffer size) than a
maximum.  The purpose of making the connection is to help a decoder
implementor make a better informed decision on whether to support more
than the minimum requirement, and help an encoder implementor
determine how much concern they should have about producing a stream
that may not play as intended on all conforming decoders.  Because
they depend on the point of view I avoided using words like limit,
minimum, and maximum here, but perhaps that made the wording awkward.

 - Mark


From nobody Wed May 13 05:46:33 2015
Return-Path: <tterribe@xiph.org>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 682651A8A76 for <codec@ietfa.amsl.com>; Wed, 13 May 2015 05:46:32 -0700 (PDT)
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id T90tCRq28iRQ for <codec@ietfa.amsl.com>; Wed, 13 May 2015 05:46:30 -0700 (PDT)
Received: from smtp.mozilla.org (mx2.scl3.mozilla.com [63.245.214.156]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 68FFA1A8A75 for <codec@ietf.org>; Wed, 13 May 2015 05:46:30 -0700 (PDT)
Received: from localhost (localhost6.localdomain [127.0.0.1]) by mx2.mail.scl3.mozilla.com (Postfix) with ESMTP id EBC72C1280 for <codec@ietf.org>; Wed, 13 May 2015 12:46:29 +0000 (UTC)
X-Virus-Scanned: amavisd-new at mozilla.org
Received: from smtp.mozilla.org ([127.0.0.1]) by localhost (mx2.mail.scl3.mozilla.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id kMTeo5ZmmAmL for <codec@ietf.org>; Wed, 13 May 2015 12:46:29 +0000 (UTC)
Received: from [172.17.0.70] (50-78-100-113-static.hfc.comcastbusiness.net [50.78.100.113]) (Authenticated sender: tterriberry@mozilla.com) by mx2.mail.scl3.mozilla.com (Postfix) with ESMTPSA id A837AC1268 for <codec@ietf.org>; Wed, 13 May 2015 12:46:29 +0000 (UTC)
Message-ID: <555347A5.9030601@xiph.org>
Date: Wed, 13 May 2015 05:46:29 -0700
From: "Timothy B. Terriberry" <tterribe@xiph.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:29.0) Gecko/20100101 SeaMonkey/2.26
MIME-Version: 1.0
To: "codec@ietf.org" <codec@ietf.org>
References: <CAMdZqKFo6wUiVizeQFWXjr80-27=r7H1MN+07sX0jm3Le=sLng@mail.gmail.com> <55528365.3090100@xiph.org> <CAMdZqKF6xb13nuF+hro6gyKaAfKXoiUjGbaCddHOQqkVvB7mHQ@mail.gmail.com>
In-Reply-To: <CAMdZqKF6xb13nuF+hro6gyKaAfKXoiUjGbaCddHOQqkVvB7mHQ@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Archived-At: <http://mailarchive.ietf.org/arch/msg/codec/5yJx-0nZPCX5vPpv_YVmqo3T9H4>
Subject: Re: [codec] draft-ietf-codec-oggopus-07 packet sizes
X-List-Received-Date: Wed, 13 May 2015 12:46:32 -0000

Mark Harris wrote:
> Actually my proposed text in Section 6 permits rejecting or partial
> processing of any packet larger than 61,440 octets (including a
> comment packet), although I'm certainly open to other suggestions.  It

Yes, I don't think that's a good idea. Or rather, I certainly don't 
think rejecting a comment packet of that size is a good idea. Since they 
can contain multiple album art pictures, etc., it is not too hard to 
make one larger than this limit, and such files are commonly found in 
the wild.

I know Rockbox, for example, has code to skip over very large comment 
headers (which I suppose would fall under your definition of 'partially 
process'), which I think is fine for embedded devices that have a 
limited ability to display album art anyway, but given that the normal 
behavior on failing to decode a valid comment header is to reject the 
entire file, I think telling people they can reject comment headers over 
60 kB will lead to interoperability problems. I don't have good advice 
for a limit to specify here for rejection, but that doesn't mean we 
should give bad advice.

> a stream that is already CBR.  How about: "Encoders SHOULD limit the
> use of padding in audio data packets to no more than is necessary to
> make a variable bitrate (VBR) stream constant bitrate (CBR)"?

No objection from me.

> Does the 7,680 octets per stream apply to the comment packet?

I don't think it should, for the above reasons.

> It is my impression that no decoders are currently required to process
> any packets at all from a stream with channel mapping 255.  If that is
> not the intention, then perhaps this is what should be clarified.  Is

Certainly we do not obligate people to implement support for channel 
mapping 255, and certainly do not obligate support for 255 channels. But 
I think it is a mistake to look at this from the decoder's perspective. 
The point of this limit is to tell an _encoder_ what it can do and expect 
to have it work. Telling an encoder, "Well, there may be a packet size 
limit in the decoder, but we don't know what it is, and something may 
decide to play your stream with too small of a buffer, and fail in the 
middle, or not," is not exactly helpful. If we're going to do that, we 
might as well not specify any limits at all.


From nobody Mon May 25 23:59:52 2015
Return-Path: <markh.sj@gmail.com>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 5C4071ACC82 for <codec@ietfa.amsl.com>; Mon, 25 May 2015 23:59:51 -0700 (PDT)
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8Y73ywpt51EJ for <codec@ietfa.amsl.com>; Mon, 25 May 2015 23:59:46 -0700 (PDT)
Received: from mail-ie0-x22b.google.com (mail-ie0-x22b.google.com [IPv6:2607:f8b0:4001:c03::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id C73CA1AC407 for <codec@ietf.org>; Mon, 25 May 2015 23:59:45 -0700 (PDT)
Received: by iesa3 with SMTP id a3so85377450ies.2 for <codec@ietf.org>; Mon, 25 May 2015 23:59:45 -0700 (PDT)
MIME-Version: 1.0
X-Received: by 10.107.25.199 with SMTP id 190mr18591671ioz.11.1432623585275; Mon, 25 May 2015 23:59:45 -0700 (PDT)
Sender: markh.sj@gmail.com
Received: by 10.107.26.1 with HTTP; Mon, 25 May 2015 23:59:45 -0700 (PDT)
In-Reply-To: <555347A5.9030601@xiph.org>
References: <CAMdZqKFo6wUiVizeQFWXjr80-27=r7H1MN+07sX0jm3Le=sLng@mail.gmail.com> <55528365.3090100@xiph.org> <CAMdZqKF6xb13nuF+hro6gyKaAfKXoiUjGbaCddHOQqkVvB7mHQ@mail.gmail.com> <555347A5.9030601@xiph.org>
Date: Mon, 25 May 2015 23:59:45 -0700
X-Google-Sender-Auth: EG-zgGu2F2IhCcE7xoJhpH4qqmY
Message-ID: <CAMdZqKFk76OJrFyrdpxOYJYk+-=c-jPceLG5kscOkotE5P29PA@mail.gmail.com>
From: Mark Harris <mark.hsj@gmail.com>
To: "codec@ietf.org" <codec@ietf.org>
Content-Type: text/plain; charset=UTF-8
Archived-At: <http://mailarchive.ietf.org/arch/msg/codec/3CHXnklMJa25hYI_U7eSM0B5t8k>
Subject: Re: [codec] draft-ietf-codec-oggopus-07 packet sizes
X-BeenThere: codec@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Codec WG <codec.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/codec>, <mailto:codec-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/codec/>
List-Post: <mailto:codec@ietf.org>
List-Help: <mailto:codec-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/codec>, <mailto:codec-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 26 May 2015 06:59:51 -0000

I've incorporated the earlier comments and made separate limits for
the comment header, clarifying that comment header packets exceeding
61,440 octets may be partially processed but not rejected entirely
unless they exceed a much larger size.  This size was chosen somewhat
arbitrarily to be 120 * 2^20 octets, which should be sufficient for
any reasonable album art in addition to other comments.

In Section 5.1.1.5 replace:

   An Ogg Opus player MUST play any Ogg Opus stream with a channel
   mapping family of 0 or 1, even if the number of channels does not
   match the physically connected audio hardware.

with:

   An Ogg Opus player MUST play any valid Ogg Opus stream with a
   channel mapping family of 0 or 1 that contains a comment header
   no larger than 125,829,120 octets (see Section 5.2), and no audio
   data packet larger than 61,440 octets (see Section 6), even if the
   number of channels does not match the physically connected audio
   hardware.

In Section 5.2, immediately before Section 5.2.1, insert:

   The comment header can be arbitrarily large and might be spread
   over a large number of Ogg pages.  Decoders SHOULD avoid attempting
   to allocate excessive amounts of memory when presented with a very
   large comment header.  To accomplish this, decoders MAY reject a
   comment header larger than 125,829,120 octets, and MAY ignore
   individual comments that are not fully contained within the first
   61,440 octets of the comment header or that would otherwise have
   no impact.
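
As an illustration of the two allowances in the paragraph above, here is
a minimal Python sketch of a comment header reader that only ever looks
at the first 61,440 octets.  It assumes the comment header layout used
by Ogg Opus (an "OpusTags" signature, a length-prefixed vendor string, a
comment count, then length-prefixed comments, with all lengths 32-bit
little-endian).  The function name and its two-argument interface
(buffered prefix plus total header size) are hypothetical, and this is
one possible reading rather than a reference implementation.

```python
import struct
from typing import List, Optional

MAX_COMMENT_HEADER = 120 * 2**20  # 125,829,120 octets; larger MAY be rejected
COMMENT_WINDOW = 61_440           # comments not fully inside MAY be ignored


def read_comments(prefix: bytes, total_size: int) -> List[str]:
    """Return the user comments fully contained in the first 61,440 octets.

    prefix     -- the first octets of the comment header (at most the window)
    total_size -- size of the whole comment header, which may be far larger
    """
    if total_size > MAX_COMMENT_HEADER:
        raise ValueError("comment header exceeds 125,829,120 octets")
    window = prefix[:COMMENT_WINDOW]
    if window[:8] != b"OpusTags":
        raise ValueError("missing OpusTags signature")

    def u32(at: int) -> Optional[int]:
        # Read a 32-bit little-endian length, or None if it is not in the window.
        if at + 4 > len(window):
            return None
        return struct.unpack_from("<I", window, at)[0]

    pos = 8
    vendor_len = u32(pos)
    if vendor_len is None:
        return []
    pos += 4 + vendor_len
    count = u32(pos)
    if count is None:
        return []
    pos += 4

    comments: List[str] = []
    for _ in range(count):
        length = u32(pos)
        if length is None or pos + 4 + length > len(window):
            break  # this comment, and every later one, is not fully in the window
        start = pos + 4
        comments.append(window[start:start + length].decode("utf-8", errors="replace"))
        pos = start + length
    return comments
```

A decoder built this way never needs to buffer more than 61,440 octets
of the comment header, yet it does not reject a file merely because a
large picture follows the textual tags.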

Replace Section 6 with:

6. Audio Data Packet Size Limits

   Technically, valid audio data packets can be arbitrarily large due
   to the padding format, although the amount of non-padding data they
   can contain is bounded.  These packets might be spread over a
   similarly enormous number of Ogg pages.  Encoders SHOULD limit the
   use of padding in audio data packets to no more than is necessary
   to make a variable bitrate (VBR) Ogg Opus stream constant bitrate
   (CBR).  Decoders SHOULD reject audio data packets larger than
   61,440 octets per Opus stream; such packets necessarily contain
   more padding than needed for this purpose.  Decoders SHOULD avoid
   attempting to allocate excessive amounts of memory when presented
   with a very large packet.  Decoders MAY reject or partially process
   audio data packets larger than 61,440 octets in an Ogg Opus stream
   with channel mapping family 1, or in any Ogg Opus stream if the
   packet is also larger than 7,680 octets per Opus stream.  The
   presence of an extremely large packet in the stream could indicate
   a memory exhaustion attack or stream corruption.
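
For illustration only (not part of the proposed text): a minimal Python
sketch of how a decoder might apply the SHOULD and MAY thresholds in the
paragraph above to an audio data packet.  The function name and the
returned labels are hypothetical, and this is one possible reading of
the wording rather than a normative rule.

```python
BASE_LIMIT = 61_440       # octets; also the per-stream SHOULD-reject threshold
PER_STREAM_MAY = 7_680    # octets per Opus stream for the MAY threshold


def classify_audio_packet(size: int, mapping_family: int, streams: int) -> str:
    """Classify an audio data packet by the size thresholds described above.

    Returns "should_reject", "may_reject", or "accept" (no size-based
    grounds for rejection under this reading).
    """
    if size > BASE_LIMIT * streams:
        # Necessarily contains more padding than needed to make VBR into CBR.
        return "should_reject"
    if size > BASE_LIMIT and (mapping_family == 1 or size > PER_STREAM_MAY * streams):
        return "may_reject"
    return "accept"


if __name__ == "__main__":
    print(classify_audio_packet(61_440, 1, 8))      # at the base limit
    print(classify_audio_packet(100_000, 255, 20))  # under 7,680 * 20
    print(classify_audio_packet(200_000, 255, 20))  # over 7,680 * 20
```

Under this reading, a mapping family 255 stream with 20 Opus streams
may carry packets up to 153,600 octets before a decoder is permitted to
reject them on size alone.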

   In an Ogg Opus stream, the largest possible valid audio data packet
   that does not use padding has a size of (61,298*N - 2) octets.
   With 255 Opus streams, this is 15,630,988 octets and can span up to
   61,298 Ogg pages, all but one of which will have a granule position
   of -1.  This is of course a very extreme packet, consisting of 255
   Opus streams, each containing 120 ms of audio encoded as 2.5 ms
   frames, each frame using the maximum possible number of octets
   (1275) and stored in the least efficient manner allowed (a VBR code
   3 Opus packet).  Even in such a packet, most of the data will be
   zeros as 2.5 ms frames cannot actually use all 1275 octets.

   The largest audio data packet consisting of entirely useful data is
   (15,326*N - 2) octets.  This corresponds to 120 ms of audio encoded
   as 10 ms frames in either SILK or Hybrid mode, but at a data rate
   of over 1 Mbps, which makes little sense for the quality achieved.

   A more reasonable audio data packet size limit is (7,664*N - 2)
   octets.  This corresponds to 120 ms of audio encoded as 20 ms
   stereo CELT mode frames, with a total bitrate just under 511 kbps
   (not counting the Ogg encapsulation overhead).  For mapping family
   1, N=8 provides a reasonable upper bound, as it allows for each of
   the 8 possible output channels to be decoded from a separate stereo
   Opus stream.  This gives a size of 61,310 octets, which is rounded
   up to a multiple of 1024 octets to yield the audio data packet size
   of 61,440 octets that any implementation is expected to be able to
   process successfully.
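
The figures in the proposed Section 6 text can be checked with a few
lines of Python.  This is only a sanity check of the arithmetic; the
function names are illustrative, and the bitrate helper uses the
per-stream coefficient alone (ignoring the "- 2" term) as an
approximation.

```python
def max_unpadded(n: int) -> int:
    """Largest valid audio data packet without padding, for n Opus streams."""
    return 61_298 * n - 2


def max_useful(n: int) -> int:
    """Largest audio data packet consisting entirely of useful data."""
    return 15_326 * n - 2


def reasonable_limit(n: int) -> int:
    """The 'more reasonable' audio data packet size limit."""
    return 7_664 * n - 2


def round_up(value: int, multiple: int) -> int:
    """Round value up to the next multiple (integer ceiling division)."""
    return -(-value // multiple) * multiple


def kbps(octets_per_stream: int, duration_ms: int = 120) -> float:
    """Approximate bitrate: bits per millisecond equals kilobits per second."""
    return octets_per_stream * 8 / duration_ms


if __name__ == "__main__":
    print(max_unpadded(255))                    # 15,630,988 octets
    print(reasonable_limit(8))                  # 61,310 octets
    print(round_up(reasonable_limit(8), 1024))  # 61,440 octets
    print(kbps(7_664), kbps(15_326))            # just under 511, just over 1021
```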

In Section 14.2 replace:
   https://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9
with:
   https://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-810004.3.9

Replace:
   https://xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2
with:
   https://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-132000A.2

 - Mark

