Substrings trigger glossary definitions

Resolved vebmayster
(@vebmayster)

9 years, 10 months ago

Love the plugin! One question: I use my glossary on an educational website for students reading literary texts in a particular foreign language. I noticed that if a word not defined in my glossary contains (usually coincidentally) a word that is defined in my glossary, that substring will trigger a glossary link. For example, if “stand” is defined in my glossary and a blog post contains the word “understand,” the “stand” part of “understand” becomes hoverable as a glossary term. (Obviously, this can give rise to some misunderstandings!) Is there any way that I can go into the code and make sure that the glossary feature is only triggered by exact matches of whole strings defined in the glossary entry? Would I need to modify the regular expressions used in the search function? I’m not sure where that function lives in the code (I’m an elementary programmer), but I do have FTP access to my site and can go in and change it with some guidance. Thanks in advance!
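For illustration, here is a minimal sketch of the whole-word matching being asked about. It is written in Python rather than the plugin's PHP, and the function name `whole_word_hits` is hypothetical; it is not the plugin's actual search code, only a demonstration of the general idea of anchoring a term with word boundaries.

```python
import re

def whole_word_hits(term, text):
    """Return every occurrence of `term` that stands alone as a word in `text`."""
    # \b asserts a word boundary on each side; re.escape keeps any
    # punctuation inside the term from being read as regex syntax.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [m.group(0) for m in pattern.finditer(text)]

print(whole_word_hits("stand", "I understand"))      # no standalone match
print(whole_word_hits("stand", "Please stand up"))   # one standalone match
```

A caveat with this approach: `\b` depends on what the regex engine counts as a word character, and that definition may not cover every script or combining mark.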

https://ww.wp.xz.cn/plugins/glossary-by-codeat/

Viewing 8 replies - 1 through 8 (of 8 total)

Plugin Author Eugenio Petulla’
(@igenius)

9 years, 10 months ago

Hi @vebmayster, thank you for the support! 🙂

I resolved this issue a few releases ago. Have you upgraded the plugin to the latest version?

We’ve got the term “code” in our glossary and it doesn’t show up inside words like “coders” or “decode”.

Take a look at our demo site here: https://codeat.co/glossary/hello-world/

and the term here: https://codeat.co/glossary/glossary/code/

If your plugin is on the latest version, it may be best to post a link to the issue on your blog. The regular expression we use has a rule that looks for a space or a special character before the term; if neither is there, the match doesn’t occur. 🙂

Thread Starter vebmayster
(@vebmayster)

9 years, 10 months ago

Thanks for the response. I just downloaded the plug-in to explore how it works, so it’s up to date. You’re right; perhaps the problem is that the previous character in this particular example is a diacritic mark and is being treated as a space? I’m not using the Latin alphabet, which makes it all the trickier.

Here is the page: http://leyenzal.org/gonshor-perets3/
The word “באַהאַנדלען” (in the second or so line of the post) is triggering the definition for “האַנדלען”. I suspect it is because the previous character “אַ” is composed of two Unicode chars, א and the small line underneath.

Could you provide some input as to where I could manually add in the list of these diacritic characters, so that they’re not treated as a boundary between adjacent strings? (There are only a few of them.) Thanks!
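To make the "two Unicode chars" observation concrete, the sketch below inspects the code points of a decomposed alef with patah and collects the combining marks in the Hebrew block. It is in Python, not the plugin's PHP, and it assumes the page uses the decomposed form (U+05D0 followed by U+05B7); the names `describe` and `HEBREW_MARKS` are hypothetical.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name, general category) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "?"), unicodedata.category(ch))
        for ch in text
    ]

# Assumption: alef followed by the combining point patah (decomposed form).
ALEF_PATAH = "\u05d0\u05b7"
for row in describe(ALEF_PATAH):
    print(row)

# Every nonspacing mark (category Mn) in the Hebrew block U+0590-U+05FF:
# a candidate list of diacritics that should not count as a word boundary.
HEBREW_MARKS = [
    chr(cp) for cp in range(0x0590, 0x0600)
    if unicodedata.category(chr(cp)) == "Mn"
]
```

The letter reports category `Lo` (a letter) while the mark reports `Mn` (a nonspacing mark), so a matcher that only recognizes letters as word characters could treat the mark as a separator.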

Plugin Author Eugenio Petulla’
(@igenius)

9 years, 10 months ago

Actually, it is very difficult for me to understand the problem because of the language (it is Hebrew, right?), but I’m working on a tweak that could help avoid situations like this. If you can do some testing for me, I would appreciate it a lot! 🙂

Try editing line 191 of the public/includes/Glossary_Tooltip_Engine.php file. Delete this part: \:\;\*\"\)\!\?\/\%\$\£\|\^\<\> Then save and tell me whether anything changes. 🙂

If that resolves your issue, we can add a dedicated option to the settings panel. 😀

Thread Starter vebmayster
(@vebmayster)

9 years, 10 months ago

Doesn’t seem to help.
I’ve determined that the problem isn’t the diacritic under the letter. It happens with any character.
To explain, if we have a word like “xyz” defined in our glossary, then “wxyz” will result in “xyz” triggering a definition. “wxyza” will not, however.
I made a test page: http://leyenzal.org/regex-test/

So it’s treating any preceding Hebrew character (whether a full character or a diacritic) as if it’s a word boundary. But it correctly understands that subsequent letters mean we’re dealing with a different word.
(By the way, the alphabet is read right to left, but that shouldn’t matter because the website is fully formatted RTL. So “preceding” means to the right.)

Thread Starter vebmayster
(@vebmayster)

9 years, 10 months ago

Is there a way to specify some range or list of Unicode characters that, when they appear before a potential glossary word, mean that this word is actually different from the glossary words? I.e., to instruct the program not to treat preceding characters as spaces. Thanks!
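One way to express "these characters are part of the word, not a boundary" is with lookarounds over an explicit character range. The sketch below is in Python, not the plugin's PHP, and is not the plugin's actual pattern; `WORD_CHARS` and `build_pattern` are hypothetical names, and the ranges chosen (the Hebrew block plus the Hebrew presentation forms) are an assumption about which characters the site uses.

```python
import re

# Assumption: U+0590-U+05FF (Hebrew letters and points) and U+FB1D-U+FB4F
# (Hebrew presentation forms) cover every character that can be part of a word.
WORD_CHARS = r"\u0590-\u05FF\uFB1D-\uFB4F"

def build_pattern(term):
    """Match `term` only when no listed character touches it on either side."""
    return re.compile(
        rf"(?<![{WORD_CHARS}]){re.escape(term)}(?![{WORD_CHARS}])"
    )

# The glossary term from the thread, and the longer word that contains it,
# written as escapes so the code points are unambiguous.
TERM = "\u05d4\u05d0\u05b7\u05e0\u05d3\u05dc\u05e2\u05df"
LONGER_WORD = "\u05d1\u05d0\u05b7" + TERM

print(bool(build_pattern(TERM).search(LONGER_WORD)))       # term inside a longer word
print(bool(build_pattern(TERM).search("x " + TERM + " y")))  # term standing alone
```

Because the range includes the combining points as well as the letters, a preceding diacritic blocks the match just as a preceding letter does.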

Plugin Author Eugenio Petulla’
(@igenius)

9 years, 10 months ago

Thank you for the link, it was very helpful. I’m working on the solution today. Keep following this thread and I will notify you when everything is ready! 🙂

Thread Starter vebmayster
(@vebmayster)

9 years, 10 months ago

Thanks!
Since this problem probably won’t arise too often for me/for others, maybe there’s another solution. Is there any way to mark off a particular word while editing a post/page, so that it doesn’t come up as a match for a defined glossary entry (i.e., overriding the default behavior on a one-case basis)? Maybe wrapping the word in brackets or something, or between HTML tags that have no function? This would be useful not only to solve my problem, but in general, so that a word that is spelled like a glossary entry but is otherwise distinct won’t come up with the wrong definition. (Like, if “bank” is defined as the “side of a river,” then a sentence “deposit the check at the <bank>” wouldn’t have “bank” matched with our glossary entry.)

Plugin Author Eugenio Petulla’
(@igenius)

9 years, 10 months ago

There is a way: simply wrap the term in a span tag using the Text editor in WordPress. For example:

Bank = parsed, so you get the tooltip

<span>Bank</span> = not parsed

No shortcodes or other stuff required. 🙂

It’s a workaround, and it will also work in your specific case:

<span>Bank</span>er

Meanwhile, I’m trying to fix the issue at its root, the handling of Unicode characters, so keep following this thread. 😀
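As a rough illustration of how a span-based opt-out could behave, the sketch below links a term everywhere except inside plain `<span>` elements. It is in Python, not the plugin's PHP, and is not the plugin's implementation; `link_terms` and the `glossary` class name are hypothetical, and it only handles bare, non-nested `<span>` tags without attributes.

```python
import re

# The capturing group makes re.split keep each <span>...</span> chunk,
# so protected chunks land at the odd indices of the resulting list.
SPAN = re.compile(r"(<span>.*?</span>)", re.IGNORECASE | re.DOTALL)

def link_terms(html, term):
    """Wrap whole-word occurrences of `term` in a link, skipping spans."""
    term_re = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    out = []
    for i, part in enumerate(SPAN.split(html)):
        if i % 2 == 1:
            out.append(part)  # inside <span>: leave untouched
        else:
            out.append(term_re.sub(
                lambda m: f'<a class="glossary">{m.group(0)}</a>', part))
    return "".join(out)

print(link_terms("Bank and <span>Bank</span>", "bank"))
print(link_terms("<span>Bank</span>er", "bank"))
```

In the second call the text outside the span is only "er", so nothing is linked and the input comes back unchanged.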

The topic ‘Substrings trigger glossary definitions’ is closed to new replies.