Ticket #40 (assigned defect)

Opened 2 years ago

Last modified 2 years ago

Regexps incorrectly use Shift_JIS setting

Reported by: deveiant Owned by: deveiant
Priority: normal Milestone: Bugfixes
Component: API Version: 1.0.0fc1
Severity: major Keywords: shift_jis regexp x flag
Cc:

Description

On behalf of an anonymous user on RubyForge:

Please remove the 'Shift_JIS' option from the Regexps.

bluecloth.rb uses the 'Shift_JIS' option in its Regexps, like this:

ReferenceImageRegexp = %r{
    (                   # Whole match = $1
        !\[ (.*?) \]    # Alt text = $2
        [ ]?            # Optional space
        (?:\n[ ]*)?     # One optional newline + spaces
        \[ (.*?) \]     # id = $3
    )
  }xs

The s (in xs above) means 'Shift_JIS', which is one of the character encodings used for Japanese.
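To illustrate what the s flag does, here is a minimal sketch. It assumes Ruby 1.9 or later, where the flag's effect can be inspected through Regexp#encoding; the Ruby 1.8 interpreter this ticket was filed against does not have that method. The shortened pattern is only for illustration.

```ruby
# Sketch, assuming Ruby 1.9+ (Regexp#encoding does not exist in Ruby 1.8).
# 'x' only enables extended syntax (whitespace and comments are ignored);
# 's' additionally pins the regexp to a Shift_JIS-family encoding.
plain  = /\[ (.*?) \]/x
with_s = /\[ (.*?) \]/xs

puts with_s.encoding          # Windows-31J, Ruby's name for the Shift_JIS variant
puts with_s.fixed_encoding?   # true: the regexp is tied to that encoding
puts plain.fixed_encoding?    # false: no encoding flag given
```

Note that the x flag is still required here, because the pattern relies on ignored whitespace; only the s is optional.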

When this option is on, I found that text written in EUC-JP (another Japanese character encoding) is sometimes not converted correctly.

For example:

![EUC-JPtext](images/foo.jpg "EUC-JPtext")

This can be converted in two ways:

correct:

<p><img src="images/foo.jpg" alt="EUC-JPtext" title="EUC-JPtext"/></p>

wrong:

<p>!<a href="images/foo.jpg" title="EUC-JPtext">EUC-JPtext</a></p>

The wrong output occurs when the EUC-JP text ends with a byte in the range 0xe0 to 0xfc.
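A likely explanation, not stated in the original report: in Shift_JIS, bytes 0xe0 to 0xfc are lead bytes of two-byte characters, so a closing ] (0x5d) that follows such a byte is read as the second half of a character and never matches \]. The sketch below assumes Ruby 1.9 or later (String#force_encoding is not available in Ruby 1.8), and the sample bytes are an illustration of my own: EUC-JP for two hiragana followed by ].

```ruby
# Sketch, assuming Ruby 1.9+. Sample bytes are illustrative only:
# 0xA4 0xA2 and 0xA4 0xE0 are two EUC-JP hiragana, 0x5D is "]".
raw = [0xA4, 0xA2, 0xA4, 0xE0, 0x5D].pack("C*")

as_euc  = raw.dup.force_encoding("EUC-JP")
as_sjis = raw.dup.force_encoding("Shift_JIS")

# Read as EUC-JP, the last character is the bracket on its own.
euc_last = as_euc.chars.last.bytes     # [0x5D]

# Read as Shift_JIS, 0xE0 is a lead byte and swallows the bracket.
sjis_last = as_sjis.chars.last.bytes   # [0xE0, 0x5D]

puts euc_last.map { |b| format("0x%02X", b) }.join(" ")
puts sjis_last.map { |b| format("0x%02X", b) }.join(" ")
```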

I think the 'Shift_JIS' option is not needed in these Regexps. Please remove it. I have attached a patch.

Attachments

bluecloth_remove_shiftjis.diff (396 bytes) - added by deveiant 2 years ago.
Strip unnecessary 's' flag from Regexps

Change History

Changed 2 years ago by deveiant

Strip unnecessary 's' flag from Regexps

Changed 2 years ago by deveiant

  • status changed from new to assigned