CVE-2023-0667: Wireshark MSMMS parsing buffer overflow

AHA! has discovered an issue with Wireshark from The Wireshark Foundation, and is issuing this disclosure in accordance with AHA!’s standard disclosure policy today, on Monday, June 5, 2023. CVE-2023-0667 has been assigned to this issue.

Any questions about this disclosure should be directed to [email protected].

Executive Summary

Due to a failure to validate the length provided by an attacker-crafted MSMMS packet, Wireshark version 4.0.5 and prior, in an unusual configuration, is susceptible to a heap-based buffer overflow and possibly code execution in the context of the process running Wireshark. CVE-2023-0667 appears to be an instance of CWE-122.

Technical Details

On line 391, a command ID is read from the packet at offset 0x24 (36).

/wireshark/epan/dissectors/packet-ms-mms.c

 388
 389     /* Read command ID and direction now so can give common command header a
 390        descriptive label */
 391     command_id = tvb_get_letohs(tvb, 36);
 392     command_dir = tvb_get_letohs(tvb, 36+2);
 393
 394
 395     /*************************/
 396     /* Common command header */

Then, on line 441, a length is read from the MSMMS packet payload. In our crash file, this length value is 0x4.

/wireshark/epan/dissectors/packet-ms-mms.c

 436     /* Timestamp */
 437     proto_tree_add_item(msmms_common_command_tree, hf_msmms_command_timestamp, tvb, offset, 8, ENC_LITTLE_ENDIAN);
 438     offset += 8;
 439
 440     /* Another length remaining field... */
 441     length_remaining = tvb_get_letohl(tvb, offset);
 442     proto_tree_add_item(msmms_common_command_tree, hf_msmms_command_length_remaining2, tvb, offset, 4, ENC_LITTLE_ENDIAN);
 443     offset += 4;
 444

Next, on line 471, the length is multiplied by 8 and then reduced by 8, leaving a value of 0x18 (24).

/wireshark/epan/dissectors/packet-ms-mms.c

 461     /* Show summary in info column */
 462     col_append_fstr(pinfo->cinfo, COL_INFO,
 463                     "seq=%03u: %s %s",
 464                     sequence_number,
 465                     (command_dir == TO_SERVER) ? "-->" : "<--",
 466                     (command_dir == TO_SERVER) ?
 467                         val_to_str_const(command_id, to_server_command_vals, "Unknown") :
 468                         val_to_str_const(command_id, to_client_command_vals, "Unknown"));
 469
 470     /* Adjust length_remaining for command-specific details */
 471     length_remaining = (length_remaining*8) - 8;
 472

Following the length_remaining calculation, the command_id retrieved earlier is used to determine the command type, and on line 480 dissect_client_transport_info is called.

/wireshark/epan/dissectors/packet-ms-mms.c

 473     /* Now parse any command-specific params */
 474     if (command_dir == TO_SERVER)
 475     {
 476         /* Commands to server */
 477         switch (command_id)
 478         {
 479             case SERVER_COMMAND_TRANSPORT_INFO:
 480                 dissect_client_transport_info(tvb, pinfo, msmms_tree,
 481                                               offset, length_remaining);
 482                 break;

On entry to the dissect_client_transport_info function, length_remaining is 24 and offset is 40.

/wireshark/epan/dissectors/packet-ms-mms.c

 715 static void dissect_client_transport_info(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree,
 716                                           guint offset, guint length_remaining)
 717 {
 718     char    *transport_info;
 719     guint   ipaddr[4];
 720     char    protocol[3+1] = "";
 721     guint   port;
 722     int     fields_matched;
 723

On line 736, length_remaining (at this point still equal to 24) has 20 subtracted from it, leaving 4. That value is passed as the length to the tvb_get_string_enc function, along with an offset value of 60.

/wireshark/epan/dissectors/packet-ms-mms.c

 734
 735     /* Extract and show the string in tree and info column */
 736     transport_info = tvb_get_string_enc(pinfo->pool, tvb, offset, length_remaining - 20, ENC_UTF_16|ENC_LITTLE_ENDIAN);
 737
 738     proto_tree_add_string_format(tree, hf_msmms_command_client_transport_info, tvb,
 739                                  offset, length_remaining-20,
 740                                  transport_info, "Transport: (%s)", transport_info);
 741
 742     col_append_fstr(pinfo->cinfo, COL_INFO, " (%s)",
 743                     format_text(pinfo->pool, (guchar*)transport_info, length_remaining - 20));
 744
 745
 746     /* Try to extract details from this string */
 747     fields_matched = sscanf(transport_info, "%*c%*c%u.%u.%u.%u%*c%3s%*c%u",
 748                             &ipaddr[0], &ipaddr[1], &ipaddr[2], &ipaddr[3],
 749                             protocol, &port);
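The length arithmetic described so far can be traced with a small sketch. The two function names are hypothetical and exist only for this illustration; they mirror the expressions on lines 471 and 736 using a 32-bit unsigned type, on the assumption that guint is a 32-bit unsigned int on the platform in question.

```c
#include <stdint.h>

/* Mirrors the expression on line 471: scale the wire value by 8, drop 8. */
uint32_t adjust_length_remaining(uint32_t wire_value)
{
    return (wire_value * 8) - 8;
}

/*
 * Mirrors the expression on line 736: the byte length handed to the
 * string extraction. In this sketch nothing enforces a lower bound,
 * so an input below 20 wraps around (unsigned arithmetic is modular).
 */
uint32_t transport_string_length(uint32_t length_remaining)
{
    return length_remaining - 20;
}
```

With the crash file's wire value of 4, the first function yields 24 and the second yields 4, matching the values in the walkthrough.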

For UTF-16 encodings, tvb_get_string_enc dispatches to tvb_get_utf_16_string, which then calls get_utf_16_string, passing the length value (4).

/wireshark/epan/tvbuff.c

2840 static guint8 *
2841 tvb_get_utf_16_string(wmem_allocator_t *scope, tvbuff_t *tvb, const gint offset, gint length, const guint encoding)
2842 {
2843         const guint8  *ptr;
2844
2845         ptr = ensure_contiguous(tvb, offset, length);
2846         return get_utf_16_string(scope, ptr, length, encoding);
2847 }

Following the above, in get_utf_16_string on line 745, the strbuf variable is initialized through the wmem_strbuf_new_sized call.

/wireshark/epan/charsets.c

 738 get_utf_16_string(wmem_allocator_t *scope, const guint8 *ptr, gint length, const guint encoding)
 739 {
 740     wmem_strbuf_t *strbuf;
 741     gunichar2      uchar2, lead_surrogate;
 742     gunichar       uchar;
 743     gint           i;       /* Byte counter for string */
 744
 745     strbuf = wmem_strbuf_new_sized(scope, length+1);
 746
 747     for(i = 0; i + 1 < length; i += 2) {
 748         if (encoding == ENC_BIG_ENDIAN)
 749             uchar2 = pntoh16(ptr + i);
 750         else
 751             uchar2 = pletoh16(ptr + i);
 752

In the wmem_strbuf_new_sized function, the strbuf is initially allocated with an 0x10 size.

/wireshark/wsutil/wmem/wmem_strbuf.c

 30 wmem_strbuf_t *
 31 wmem_strbuf_new_sized(wmem_allocator_t *allocator,
 32                       size_t alloc_size)
 33 {
 34     wmem_strbuf_t *strbuf;
 35
 36     strbuf = wmem_new(allocator, wmem_strbuf_t);
 37
 38     strbuf->allocator = allocator;
 39     strbuf->len       = 0;
 40     strbuf->alloc_size = alloc_size ? alloc_size : DEFAULT_MINIMUM_SIZE;
 41
 42     strbuf->str    = (gchar *)wmem_alloc(strbuf->allocator, strbuf->alloc_size);
 43     strbuf->str[0] = '\0';
 44
 45     return strbuf;
 46 }

Then, back in the get_utf_16_string function, we enter the for loop, which helps determine the size of strbuf. In the attached crash, this loop iterates twice, both times going through line 777 and calling the wmem_strbuf_append_unichar function.

/wireshark/epan/charsets.c

 747     for(i = 0; i + 1 < length; i += 2) {
 748         if (encoding == ENC_BIG_ENDIAN)
 749             uchar2 = pntoh16(ptr + i);
 750         else
 751             uchar2 = pletoh16(ptr + i);
 752
 753         if (IS_LEAD_SURROGATE(uchar2)) {
 754             /*
 755              * Lead surrogate.  Must be followed by
 756              * a trail surrogate.
 757              */
 758             i += 2;
 759             if (i + 1 >= length) {
 760                 /*
 761                  * Oops, string ends with a lead surrogate.
 762                  *
 763                  * Insert a REPLACEMENT CHARACTER to mark the error,
 764                  * and quit.
 765                  */
 766                 wmem_strbuf_append_unichar(strbuf, UNREPL);
 767                 break;
 768             }
 769             lead_surrogate = uchar2;
 770             if (encoding == ENC_BIG_ENDIAN)
 771                 uchar2 = pntoh16(ptr + i);
 772             else
 773                 uchar2 = pletoh16(ptr + i);
 774             if (IS_TRAIL_SURROGATE(uchar2)) {
 775                 /* Trail surrogate. */
 776                 uchar = SURROGATE_VALUE(lead_surrogate, uchar2);
 777                 wmem_strbuf_append_unichar(strbuf, uchar);
 778             } else {

Each call to wmem_strbuf_append_unichar increments the strbuf->len value (leaving a length of 2).

/wireshark/wsutil/wmem/wmem_strbuf.c

234 wmem_strbuf_append_unichar(wmem_strbuf_t *strbuf, const gunichar c)
235 {
236     gchar buf[6];
237     size_t charlen;
238
239     charlen = g_unichar_to_utf8(c, buf);
240
241     wmem_strbuf_grow(strbuf, charlen);
242
243     memcpy(&strbuf->str[strbuf->len], buf, charlen);
244     strbuf->len += charlen;
245     strbuf->str[strbuf->len] = '\0';
246 }
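The key property of this conversion is that the output size depends on the decoded content, not on the input byte count. The following is a simplified model, not Wireshark's code: the function names are hypothetical, and the handling of malformed input (every error counted as one 3-byte U+FFFD) is an assumption made to keep the sketch short.

```c
#include <stddef.h>
#include <stdint.h>

/* UTF-8 bytes needed to encode one code point from the BMP or below. */
static size_t utf8_encoded_len(uint32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/*
 * Simplified model: number of UTF-8 bytes (excluding the NUL) that a
 * UTF-16LE input of len bytes decodes to. Malformed sequences are
 * counted as U+FFFD, which takes 3 bytes in UTF-8.
 */
size_t utf16le_decoded_len(const uint8_t *p, size_t len)
{
    size_t out = 0;
    size_t i = 0;

    while (i + 1 < len) {
        uint32_t unit = (uint32_t)p[i] | ((uint32_t)p[i + 1] << 8);
        i += 2;

        if (unit >= 0xD800 && unit <= 0xDBFF) {
            /* Lead surrogate: a trail surrogate must follow. */
            if (i + 1 < len) {
                uint32_t trail = (uint32_t)p[i] | ((uint32_t)p[i + 1] << 8);
                if (trail >= 0xDC00 && trail <= 0xDFFF) {
                    i += 2;
                    out += 4; /* supplementary characters take 4 bytes */
                    continue;
                }
            }
            out += 3; /* unpaired lead surrogate */
        } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
            out += 3; /* unpaired trail surrogate */
        } else {
            out += utf8_encoded_len(unit);
        }
    }
    if (i < len)
        out += 3; /* odd trailing byte */
    return out;
}
```

In this model, four input bytes holding two ASCII-range code units decode to only 2 bytes, so a caller that keeps using the input length (4) afterwards is working with a number larger than the converted string.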

Finally, after the loop within get_utf_16_string completes, the wmem_strbuf_finalize function is called.

/wireshark/epan/charsets.c

 804     }
 805
 806     /*
 807      * If i < length, this means we were handed an odd number of bytes,
 808      * so we're not a valid UTF-16 string; insert a REPLACEMENT CHARACTER
 809      * to mark the error.
 810      */
 811     if (i < length)
 812         wmem_strbuf_append_unichar(strbuf, UNREPL);
 813     return (guint8 *) wmem_strbuf_finalize(strbuf);
 814 }

The wmem_strbuf_finalize() function then calls wmem_realloc(), setting the size of the strbuf->str buffer to strbuf->len+1 (equalling 3).

/wireshark/wsutil/wmem/wmem_strbuf.c

382 char *
383 wmem_strbuf_finalize(wmem_strbuf_t *strbuf)
384 {
385     if (strbuf == NULL)
386         return NULL;
387
388     char *ret = (char *)wmem_realloc(strbuf->allocator, strbuf->str, strbuf->len+1);
389
390     wmem_free(strbuf->allocator, strbuf);
391
392     return ret;
393 }
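The allocate, append, and shrink lifecycle can be modeled with plain malloc and realloc. This is a sketch under stated assumptions: the strbuf_model type and its functions are hypothetical stand-ins, the doubling growth policy and the fallback size of 16 are placeholders, and none of it is the wmem implementation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growable string buffer. */
typedef struct {
    char  *str;
    size_t len;        /* bytes used, excluding the NUL */
    size_t alloc_size; /* bytes currently allocated */
} strbuf_model;

/* Allocate with a caller-chosen size (placeholder fallback of 16). */
int strbuf_model_init(strbuf_model *sb, size_t alloc_size)
{
    sb->alloc_size = alloc_size ? alloc_size : 16;
    sb->len = 0;
    sb->str = (char *)malloc(sb->alloc_size);
    if (sb->str == NULL)
        return -1;
    sb->str[0] = '\0';
    return 0;
}

/* Append n bytes, growing the allocation when needed. */
int strbuf_model_append(strbuf_model *sb, const char *bytes, size_t n)
{
    if (sb->len + n + 1 > sb->alloc_size) {
        size_t new_size = sb->alloc_size;
        char *grown;
        while (new_size < sb->len + n + 1)
            new_size *= 2;
        grown = (char *)realloc(sb->str, new_size);
        if (grown == NULL)
            return -1;
        sb->str = grown;
        sb->alloc_size = new_size;
    }
    memcpy(sb->str + sb->len, bytes, n);
    sb->len += n;
    sb->str[sb->len] = '\0';
    return 0;
}

/*
 * Shrink the allocation to len + 1 and hand the string to the caller.
 * After this call, only len + 1 bytes are requested from the allocator,
 * regardless of the size passed to strbuf_model_init.
 */
char *strbuf_model_finalize(strbuf_model *sb, size_t *final_size)
{
    char *shrunk = (char *)realloc(sb->str, sb->len + 1);
    if (shrunk == NULL)
        shrunk = sb->str; /* shrink failed: the old block is still valid */
    *final_size = sb->len + 1;
    sb->str = NULL;
    return shrunk;
}
```

Initializing with 5 bytes (length + 1 for a 4-byte input), appending two 1-byte characters, and finalizing leaves a 3-byte allocation in this model, which is the same size the walkthrough arrives at.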

Following the above, a pointer to the buffer is returned to the dissect_client_transport_info function, where it is later passed to the format_text function with a length of 4 bytes. Because the buffer was shrunk to 3 bytes, this allows a 1-byte read past its end.
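The size mismatch reduces to simple arithmetic, sketched below. Both helpers are hypothetical names introduced for illustration. The second one shows one possible way to avoid the mismatch, by deriving the length from the converted string itself; it is not presented as the fix that was merged upstream.

```c
#include <stddef.h>
#include <string.h>

/* Bytes a reader would touch beyond a buffer of buf_size bytes. */
size_t overread_bytes(size_t buf_size, size_t read_len)
{
    return read_len > buf_size ? read_len - buf_size : 0;
}

/*
 * Hypothetical mitigation sketch: never trust the pre-conversion length.
 * Clamp the claimed length to the NUL-terminated string's actual length.
 */
size_t clamp_to_string(const char *s, size_t claimed_len)
{
    size_t actual = strlen(s);
    return claimed_len < actual ? claimed_len : actual;
}
```

For the values in this crash, overread_bytes(3, 4) is 1, and clamping a claimed length of 4 against a 2-character string yields 2.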

Below is a Base64 encoded blob of the proof of concept. Note that this must be run with fuzzshark as tshark will not crash.

Ex4JdM76C7DO+guwTU1TDXr/igDv/9+KigEAAA0BAAAEAAAAAgADAAAAAAAAAAABN
AAYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA2wAa29vb29vb29vbEz4JAc4H

Attacker Value

Passing the above blob to fuzzshark will trigger a heap overflow, and any crash in fuzzshark is necessarily a bug in Wireshark library code, which also affects the Wireshark GUI application and TShark, as they all hit the same code paths. Therefore, we’re confident that a specially crafted MSMMS packet that implements this crash behavior is exploitable.

According to comments in issue 19086, “Technically it is possible to run a production release with WIRESHARK_DEBUG_WMEM_OVERRIDE=simple and get the buffer overread, but that doesn’t really fall into the category of ‘a unsuspecting naive user could hit this.’” In other words, it appears unlikely that this issue can be reliably exploited in a normal production environment, since the default configuration of Wireshark does not expose the affected code path in the same way that fuzzshark does.

Credit

This issue is being disclosed through the AHA! CNA and is credited to: zenofex and WanderingGlitch

Timeline

Note that while we expected to publicly disclose this issue in early July (60 days after disclosure to The Wireshark Foundation), the vendor publicized the issue on May 22nd in issue 19086.

  • 2023-04-27 (Wed): Initial findings presented at the regularly scheduled meeting 0x00c7.
  • 2023-05-17 (Wed): PoC validated and analysis completed for disclosure.
  • 2023-05-18 (Thu): Disclosed to the vendor via email at [email protected].
  • 2023-05-18 (Thu): Vendor acknowledged, and opened issue 19086 to address.
  • 2023-05-18 (Thu): Patch merged to the release-3.6 and release-4.0 branches of Wireshark.
  • 2023-05-22 (Mon): Issue 19086 made public by the vendor.
  • 2023-06-06 (Tue): Public disclosure of CVE-2023-0667