Ticket #15 (assigned defect)

Opened 3 years ago

Last modified 2 years ago

Long lists cause stack overflow

Reported by: John Croisant <jacius@…> Owned by: deveiant
Priority: high Milestone: Bugfixes
Component: API Version: 1.0.0
Severity: major Keywords: regexp overflow bug
Cc:

Description

BlueCloth breaks with a stack overflow if it tries to parse a list (either ordered or unordered) that is too long:

RegexpError: Stack overflow in regexp matcher: /
        ^						# Start of line
        <(p|div|h[1-6]|blockquote|pre|table|dl|ol|ul|script)	# Start tag: \2
        \b						# word break
        (.*\n)*?				# Any number of lines, minimal match
        <\/\1>					# Matching end tag
        [ ]*					# trailing spaces
        (?=\n+|\Z)				# End of line or document
      /ix
    from ./bluecloth.rb:342:in `gsub!'
    from ./bluecloth.rb:342:in `hide_html_blocks'
    from ./bluecloth.rb:248:in `apply_block_transforms'
    from ./bluecloth.rb:202:in `to_html'

The number of list items it can handle decreases as the items get longer. With 26-character lines (e.g. 'a'..'z'), BlueCloth breaks at 78 items or more.

I've attached a small script to help test this. You can vary the item length and test a range of item counts to see how many list items BlueCloth can handle.
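The attached script is not reproduced in this ticket. As a rough illustration of the same idea, here is a minimal sketch; `build_list` and `first_failing_count` are hypothetical names, not taken from the attachment. The converter is passed in as a block, so the helpers themselves do not depend on BlueCloth.

```ruby
# Hypothetical helpers; not the attached blueclothtest.rb.

# Build a Markdown list with `item_count` items, each `item_length`
# characters long. Cycling through 'a'..'z' means an item_length of 26
# yields the 'a'..'z' lines mentioned in the report.
def build_list(item_count, item_length, ordered: false)
  body = ('a'..'z').cycle.take(item_length).join
  (1..item_count).map { |i|
    marker = ordered ? "#{i}." : "*"
    "#{marker} #{body}\n"
  }.join
end

# Return the first item count in `counts` for which the given block
# raises a RegexpError, or nil if every count converts cleanly.
def first_failing_count(counts, item_length, ordered: false)
  counts.find do |n|
    begin
      yield build_list(n, item_length, ordered: ordered)
      false
    rescue RegexpError
      true
    end
  end
end

# Usage, assuming BlueCloth is installed:
#   require 'bluecloth'
#   first_failing_count(1..200, 26) { |text| BlueCloth.new(text).to_html }
```

The result depends on the Ruby version and its regexp engine, so the threshold found may differ from the one reported here, or no failure may occur at all.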

This might be a bug in Ruby itself, in which case I request that you pass it upstream. Even then, there might be a way for BlueCloth to work around it.

I originally noticed this while maintaining a wiki running on Instiki 0.9.1 (http://www.instiki.org). The version string for the BlueCloth source used by Instiki is:

#  $Id: bluecloth.rb,v 1.3 2004/05/02 15:56:33 webster132 Exp $

Attachments

blueclothtest.rb (1.3 kB) - added by deveiant 2 years ago.
Flexible script that demonstrates/tests bug

Change History

Changed 3 years ago by deveiant

  • status changed from new to assigned

Date: 2005-05-11 20:49
Sender: Michael Granger

Thanks for this report. This is a problem with Ruby's regexp implementation, and I've yet to find a workaround for it for this particular case. I've started rewriting the matching for lists to not use one big monolithic Regexp, but it's been slow going so far. I'll hopefully have this done for the next release.
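To illustrate what replacing a monolithic Regexp can look like, here is a minimal sketch that finds HTML blocks by walking the input line by line, so that no single match has to span the whole block. It is not the rewrite described above and not BlueCloth code; `BLOCK_TAGS`, `START_TAG`, and `find_html_blocks` are hypothetical names, and the tag list is copied from the regexp in the error message.

```ruby
# Hypothetical sketch; not BlueCloth's implementation.
BLOCK_TAGS = %w[p div h1 h2 h3 h4 h5 h6 blockquote pre table dl ol ul script].freeze
START_TAG = /\A<(#{BLOCK_TAGS.join('|')})\b/i

# Return an array of [first_line_index, last_line_index] pairs, one per
# HTML block found in `text`. Each per-line match is small, so the work
# done by the regexp matcher does not grow with the length of the block.
def find_html_blocks(text)
  lines = text.lines
  blocks = []
  i = 0
  while i < lines.length
    if (m = START_TAG.match(lines[i]))
      # The end tag must start a later line and be followed only by spaces.
      end_tag = /\A<\/#{Regexp.escape(m[1])}>[ ]*$/i
      j = ((i + 1)...lines.length).find { |k| end_tag.match?(lines[k]) }
      if j
        blocks << [i, j]
        i = j # resume scanning after the end of this block
      end
    end
    i += 1
  end
  blocks
end
```

Like the minimal match in the original regexp, this stops at the first matching end tag, so nested blocks of the same tag are not paired correctly.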

I've seen this before (bug #8), and passed it on back then, but didn't hear anything back. I'll try again with the new bug-reporting stuff.

Changed 3 years ago by deveiant

  • version changed from 1.0.0fc2 to 1.0.0

As a first step toward fixing this, I added tests to ensure that my changes to lists don't break anything.

Changed 2 years ago by deveiant

  • milestone changed from Markdown 1.0.1 to Bugfixes

Changed 2 years ago by deveiant

  • attachment blueclothtest.rb added

Flexible script that demonstrates/tests bug
