Links aren’t formatting properly with parentheses and percentage signs.
Posted: 02 February 2008 09:11 PM
Joined: 2008-01-15
134 posts

A member was trying to link to the Wikipedia article which can be reached via EITHER of these two links:

http://en.wikipedia.org/wiki/Steppenwolf_(band)

or

http://en.wikipedia.org/wiki/Steppenwolf_(band)

(I guess I’m going to have to put a space in between all of the symbols at the end so that you can understand what I’m trying to link to.  This is the real link (if you remove the spaces):

http://en.wikipedia.org/wiki/Steppenwolf _ % 2 8 b a n d % 2 9

The EE software keeps altering the percentage sign followed by those two numbers into something mangled…)
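
For reference, the two forms of the link differ only in percent-encoding: `%28` and `%29` are the encoded forms of `(` and `)`. The short Python sketch below illustrates that equivalence. It uses Python's standard `urllib.parse` module purely as an illustration and has nothing to do with how ExpressionEngine itself processes links.

```python
from urllib.parse import quote, unquote

literal = "http://en.wikipedia.org/wiki/Steppenwolf_(band)"
encoded = "http://en.wikipedia.org/wiki/Steppenwolf_%28band%29"

# Decoding the percent-encoded form yields the literal-parentheses form.
decoded = unquote(encoded)

# Encoding the literal form (leaving ':' and '/' untouched) yields the other.
reencoded = quote(literal, safe=":/")

print(decoded == literal)    # True
print(reencoded == encoded)  # True
```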

When the link is created with the parenthesis method, even if we enter it correctly like this:

[url=http://en.wikipedia.org/wiki/Steppenwolf_(band)]Steppenwolf[/url]

It automatically removes the parentheses from around the word “band” and links to this:

http://en.wikipedia.org/wiki/Steppenwolf_band

And when using the percentage sign method like this:

[url=http://en.wikipedia.org/wiki/Steppenwolf_%28band%29]Steppenwolf[/url]

It comes out mangled and links to this:

http://en.wikipedia.org/wiki/Steppenwolf_⢺nd;)

Any idea why this is happening or how I can fix it?  I thought I had successfully created the link on these forums once, but it appears to be doing the same thing now.  Check it out:

Link 1: Steppenwolf

Link 2: Steppenwolf

This is obviously a big problem since Wikipedia is so popular.  Any idea on how to fix this?  I didn’t think it happened on your official forums, but now that I see it does, this may even be a bug report.

EDIT 1: Here is another link that won’t format properly in EE:

http://tv.yahoo.com/show/35099/news/urn:newsml:eonlinekristen.com:20080202:TV-6a829db24533e7aa026268d0738c29d5__ER:1;_ylt=AuK7ETKJIdGTP7M9Gu47UVyAo9EF

EDIT 2: Parentheses don’t work when trying to hotlink an image either.

 Signature 

Lone Shadow Gaming Community
LSGC Forums

Posted: 03 February 2008 09:43 AM   [ # 1 ]
Joined: 2002-06-03
6840 posts

Parentheses aren’t allowed in URLs in ExpressionEngine because in certain circumstances they can cause parts of the URL to be interpreted as code, and they are used as part of cross-site scripting (XSS) hacks.  Unfortunately, because some browsers will interpret even URL-encoded parentheses in this manner and still allow code execution, ExpressionEngine errs on the side of caution.  URL encoding makes invalid characters safe for transport in a URL, but not safe to use.  The action ExpressionEngine is taking keeps you and your site’s visitors safe.  Of course, this won’t happen to URLs that you enter into weblog entries, since we can assume that people with access to those have a higher level of trust than it is safe to assume for forums, wikis, comments, etc.  We know it can be annoying, so we continue to improve the security checks when we can to allow desired information to pass, but only if it does not present a security risk.  So this is known behavior, and an unintended consequence of maintaining effective armor against the ever-increasing prevalence of XSS attacks.
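
To make the described behavior concrete, here is a minimal Python sketch of a filter that drops parentheses from a submitted URL. It is not ExpressionEngine's code: the function name `strip_parentheses` is hypothetical, and treating the percent-encoded forms the same way as the literal ones is an assumption made for illustration (the first post reports that the encoded form actually came out mangled rather than cleanly stripped).

```python
import re

# Matches literal parentheses and their percent-encoded forms (%28, %29).
_PARENS = re.compile(r"[()]|%2[89]")


def strip_parentheses(url: str) -> str:
    """Hypothetical filter: remove parentheses, literal or percent-encoded."""
    return _PARENS.sub("", url)


print(strip_parentheses("http://en.wikipedia.org/wiki/Steppenwolf_(band)"))
# http://en.wikipedia.org/wiki/Steppenwolf_band
```

The result for the literal form matches the broken link reported in the first post. A filter this simple would not catch double-encoded input such as `%2528`, which is one reason real sanitizers tend to be more aggressive.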

Posted: 03 February 2008 06:57 PM   [ # 2 ]
Joined: 2008-01-15
134 posts

How did IPB protect against such things?  I can’t imagine that, at their high version number, they’re leaving their customers vulnerable, yet these links worked properly when I was using it.

Posted: 03 February 2008 09:00 PM   [ # 3 ]
Joined: 2002-06-03
6840 posts

I do not know much about IPB, but they have had a history of security issues.

Posted: 04 February 2008 01:03 AM   [ # 4 ]
Joined: 2008-01-15
134 posts
Derek Jones - 04 February 2008 02:00 AM

I do not know much about IPB, but they have had a history of security issues.

Well, is there at least a way for people to copy the plain text in there?  As you can see in my example above, EE changes some of the characters so that people can’t even copy and paste the text… we simply cannot link to any articles that have those characters in the URL (which is a decent number).

Posted: 04 February 2008 10:13 AM   [ # 5 ]
Joined: 2002-06-03
6840 posts

Yes, if you disable the “Auto-convert URLs and email addresses into links?” feature then the submitted text will remain unformatted.

Posted: 04 February 2008 03:37 PM   [ # 6 ]
Joined: 2008-01-15
134 posts
Derek Jones - 04 February 2008 03:13 PM

Yes, if you disable the “Auto-convert URLs and email addresses into links?” feature then the submitted text will remain unformatted.

Well I want that feature enabled without distorting my URLs.  It even distorts it within the “code” formatting.

Also, it distorts it on the official EE forums (right here) and you have auto-converting disabled, so that isn’t the problem.  I had to insert spaces in between each character in my original post so that it didn’t change the characters.

Posted: 04 February 2008 03:57 PM   [ # 7 ]
Joined: 2002-06-03
6840 posts

True, my apologies.  Though if you just submit it as you have in the first post, no such conversion occurs.

http://en.wikipedia.org/wiki/Steppenwolf_(band)

It’s only with URL encoded values that you are seeing the conversion.  Let me bring this topic up again with the rest of the development team and see if they have any thoughts as to a safe workaround for you.

Posted: 04 February 2008 04:16 PM   [ # 8 ]
Joined: 2008-01-15
134 posts
Derek Jones - 04 February 2008 08:57 PM

True, my apologies.  Though if you just submit it as you have in the first post, no such conversion occurs.

http://en.wikipedia.org/wiki/Steppenwolf_(band)

It’s only with URL encoded values that you are seeing the conversion.  Let me bring this topic up again with the rest of the development team and see if they have any thoughts as to a safe workaround for you.

Thanks for the help.  The reason I provided the link with the percent signs is that Wikipedia gives you that link.  I just happen to know you can change them to parentheses to the same effect, but my members aren’t going to figure that out and will just end up with a broken link.

Posted: 04 February 2008 04:58 PM   [ # 9 ]
Joined: 2002-06-03
6840 posts

Ok, we have made a determination: future versions will be able to display the URLs as plain text without the character conversion, but I’m afraid it’s just not safe for the default behavior of pMcode and auto-created links to allow parentheses of any kind in the URL.  Even with this future change, you would need to allow all HTML in your forums, and users would need to create regular HTML links.

You may email me (not PM) if you would like the modified file that will at least prevent the characters from being converted to invalid character entities in plain text.
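
One way to picture “plain text without the character conversion” is ordinary HTML escaping, which leaves percent-encoded sequences alone. The sketch below uses Python's standard `html.escape` as a stand-in. The modified ExpressionEngine file is not shown in this thread, so this illustrates the general idea under that assumption and says nothing about the file's actual contents.

```python
import html

url = "http://en.wikipedia.org/wiki/Steppenwolf_%28band%29"

# Escaping for display as text: only &, <, > and quotes are rewritten,
# so the %28 / %29 sequences pass through unchanged.
escaped_url = html.escape(url)

# Markup, by contrast, is neutralised and shown literally.
escaped_markup = html.escape('<a href="x">')

print(escaped_url == url)  # True
print(escaped_markup)      # &lt;a href=&quot;x&quot;&gt;
```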

Posted: 26 November 2009 02:09 AM   [ # 10 ]
Joined: 2007-05-30
179 posts

Out of curiosity: for site owners willing to take the risk of allowing parentheses in URIs, are there sufficient hooks around EE’s URI cleaning functions to rewrite the functionality via an extension?

Posted: 26 November 2009 02:11 AM   [ # 11 ]
Joined: 2002-06-03
6840 posts

No, Michael, sorry, such systems are not exposed to extensions so that every unmodified installation of ExpressionEngine can be assured to have certain protections.  For your own installations, of course, you are certainly free to modify the code as you see fit, but particularly in the case of application security measures, such changes are at your own risk and not supported.

Posted: 26 November 2009 03:03 PM   [ # 12 ]
Joined: 2007-05-30
179 posts

Sorry to be beating this dead horse, but…

I just noticed the line in config.php about uri characters:

$config['permitted_uri_chars'] = 'a-z 0-9~%.:_\\-';

Can I modify this variable to allow parentheses in blog and wiki titles?

Posted: 26 November 2009 03:04 PM   [ # 13 ]
Joined: 2002-06-03
6840 posts

No, that config item is not used by ExpressionEngine.
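
Although ExpressionEngine does not consult that setting, a value of this shape is typically used as a regular-expression character class that every URI segment must match. The Python sketch below shows that general idea only. The helper name `is_permitted` and the assumption that the check is case-insensitive and applied per segment are illustrative, not taken from ExpressionEngine or any specific framework release.

```python
import re

# The value from the config line above, as PHP sees it after unescaping
# the single-quoted string: a-z 0-9~%.:_\-
PERMITTED_URI_CHARS = r"a-z 0-9~%.:_\-"

# Assumption: the setting is dropped into a character class and applied
# case-insensitively to a whole URI segment.
SEGMENT_RE = re.compile(rf"[{PERMITTED_URI_CHARS}]+", re.IGNORECASE)


def is_permitted(segment: str) -> bool:
    """Hypothetical check: True if every character is in the allowlist."""
    return SEGMENT_RE.fullmatch(segment) is not None


print(is_permitted("Steppenwolf_band"))         # True
print(is_permitted("Steppenwolf_(band)"))       # False
print(is_permitted("Steppenwolf_%28band%29"))   # True
```

Under such a scheme, adding `(` and `)` to the list would let literal parentheses through, which is exactly the trade-off discussed earlier in the thread.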

Posted: 30 November 2009 08:03 PM   [ # 14 ]
Joined: 2007-05-30
179 posts

Interestingly, John Gruber just released his ideal regex for matching URLs, which does account for a set of parentheses in a URL:

http://daringfireball.net/2009/11/liberal_regex_for_matching_urls

Nifty.
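
As a rough illustration of what “accounting for a set of parentheses” means, here is a much-simplified Python pattern. It is not Gruber's regex (see the link above for that). It is only a sketch that accepts one level of balanced parentheses inside an `http`/`https` URL and leaves trailing punctuation out of the match. The name `URL_RE` is made up for this example.

```python
import re

URL_RE = re.compile(
    r"https?://"
    r"(?:[^\s()<>]"          # any single ordinary URL character, or
    r"|\([^\s()<>]*\))+"     # one balanced (...) group with no nesting
    r"(?<![.,;:!?])"         # do not end the match on trailing punctuation
)

text = "See http://en.wikipedia.org/wiki/Steppenwolf_(band), a rock band."
print(URL_RE.findall(text))
# ['http://en.wikipedia.org/wiki/Steppenwolf_(band)']

wrapped = "(more at http://example.com/page)"
print(URL_RE.findall(wrapped))
# ['http://example.com/page']
```

Finding such URLs in text is a separate question from whether it is safe to turn them into links.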

Posted: 30 November 2009 08:09 PM   [ # 15 ]
Joined: 2002-06-03
6840 posts

Yes, that’s a handy pattern for automatically parsing URLs from strings.  But our reason for not allowing parentheses is not that we aren’t sharp enough to figure out the regex pattern.
